Enhancements for Hybrid and End-to-End Speech Recognition Architectures

Watzel, Tobias

2023

Original title: Enhancements for Hybrid and End-to-End Speech Recognition Architectures
Translated title: Erweiterungen für hybride und Ende-zu-Ende Spracherkennungssysteme
Author: Watzel, Tobias
Year: 2023
Document type: Dissertation
Faculty/School: TUM School of Computation, Information and Technology
Advisor: Rigoll, Gerhard (Prof. Dr.)
Referee: Rigoll, Gerhard (Prof. Dr.); Fingscheidt, Tim (Prof. Dr.)
Language: en
Subject group: DAT Datenverarbeitung, Informatik
Keywords: ASR, speech recognition, hybrid, end-to-end
Translated keywords: automatische Spracherkennung, hybrid, Ende-zu-Ende
TUM classification: DAT 815
Abstract: This work introduces enhancements for three well-established model architectures in automatic speech recognition. Firstly, discrete neural quantizers for hybrid approaches are discussed, capable of surpassing continuous systems. Secondly, time-reversed components for attentional models are established, providing beneficial information for standard attentional models. Finally, novel localness and fusion strategies for self-attentional architectures are elaborated, boosting the local context.
Translated abstract: Diese Arbeit stellt Erweiterungen für drei etablierte Systemarchitekturen der automatischen Spracherkennung vor. Zunächst werden diskrete neuronale Quantisierer für hybride Ansätze erörtert, welche kontinuierliche Systeme übertreffen können. Danach werden zeitverdrehte Komponenten für Attentional-Modelle untersucht, welche hilfreiche zeitliche Informationen liefern. Abschließend werden neue lokale Fusionsstrategien in Self-Attentional-Modellen vorgestellt, welche lokale Informationen verstärken.
WWW: https://mediatum.ub.tum.de/?id=1690600
Date of submission: 09.11.2022
Oral examination: 26.04.2023
File size: 3996303 bytes
Pages: 168
Urn (citeable URL): https://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:91-diss-20230426-1690600-1-3
Last change: 21.06.2023

Occurrences:

mediaTUM Gesamtbestand Elektronische Prüfungsarbeiten Fachgebiet Datenverarbeitung, Informatik

mediaTUM Gesamtbestand Einrichtungen Schools TUM School of Computation, Information and Technology Departments Computer Engineering Mensch-Maschine-Kommunikation (Prof. Rigoll) Publication Year 2023

mediaTUM Gesamtbestand Elektronische Prüfungsarbeiten School TUM School of Computation, Information and Technology

mediaTUM Gesamtbestand Elektronische Prüfungsarbeiten School TUM School of Computation, Information and Technology Computer Engineering

mediaTUM Gesamtbestand Hochschulbibliographie 2023 Schools und Fakultäten TUM School of Computation, Information and Technology Mensch-Maschine-Kommunikation (Prof. Rigoll)

mediaTUM Gesamtbestand Einrichtungen Schools TUM School of Computation, Information and Technology Prüfungsarbeiten Dissertationen