Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition

Kürzinger, Ludwig; Chavez Rosas, Edgar Ricardo; Li, Lujun; Watzel, Tobias; Rigoll, Gerhard

Title:: Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition
Document type:: Conference paper
Author(s):: Kürzinger, Ludwig; Chavez Rosas, Edgar Ricardo; Li, Lujun; Watzel, Tobias; Rigoll, Gerhard
Abstract:: Recent advances in Automatic Speech Recognition (ASR) have demonstrated that end-to-end systems can achieve state-of-the-art performance. There is a trend towards deeper neural networks; however, those ASR models are also more complex and more vulnerable to specially crafted noisy data. Such Audio Adversarial Examples (AAEs) were previously demonstrated on ASR systems that use Connectionist Temporal Classification (CTC), as well as on attention-based encoder-decoder architectures. Following the idea of the hybrid CTC/attention ASR system, this work proposes algorithms to generate AAEs that combine both approaches into a joint CTC-attention gradient method. Evaluation is performed using a hybrid CTC/attention end-to-end ASR model on two reference sentences as a case study, as well as on the TEDlium v2 speech recognition task. We then demonstrate the application of this algorithm for adversarial training to obtain a more robust ASR model.
Editor(s):: Karpov, Alexey; Potapova, Rodmonga
Conference / Book title:: Speech and Computer
Publisher / Institution:: Springer International Publishing
Place of publication:: Cham
Year:: 2020
Month:: Sep
Pages:: 255-266
Print-ISBN:: 978-3-030-60276-5

Occurrences:

mediaTUM full collection / Institutions / Schools / TUM School of Computation, Information and Technology / Departments / Computer Engineering / Mensch-Maschine-Kommunikation (Prof. Rigoll)
