2021 Fiscal Year Research-status Report
Phantom in the Opera: the Vulnerabilities of Speech Interface for Robotic Dialogue System
Project/Area Number |
21K17837
|
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator |
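李 勝, National Institute of Information and Communications Technology, Universal Communication Research Institute, Advanced Speech Translation Research and Development Promotion Center, Researcher (70840940)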
|
Project Period (FY) |
2021-04-01 – 2023-03-31
|
Keywords | adversarial attacks / speech recognition / speech enhancement |
Outline of Annual Research Achievements |
Despite COVID-19, our project has been fruitful and has progressed as planned. Over the last two years, we have tracked new, powerful deep neural network-based models and new attack methods. To protect the system from attacks, we are particularly interested in applying existing technologies, e.g., speech enhancement or adaptation. This year, my research focused on investigating the potential of speech enhancement. Papers from this research have been accepted by journals and top conferences. Next year, we will continue to build concrete speech recognition systems with newly popular models and attack methods. Reliable and easy-to-implement methods, e.g., speech enhancement, will also be investigated to protect the system from adversarial attacks.
|
Current Status of Research Progress |
1: Research has progressed more than it was originally planned.
Reason
This year, the progress is as follows: We constructed speech recognition systems with recent popular training toolkits and neural network types (accepted in journals and conferences, e.g., ICASSP 2022). We surveyed current attack methods. We implemented robust adversarial attacks using Kaldi-based ASR systems. We were also pleased to find that this framework can be used to protect sensitive speech content (accepted at LREC 2022). Regarding defense against attacks, we found that adversarial audio is very sensitive. Moreover, its spectrogram features differ greatly from those of the human voice, so it can be treated as a special kind of noise. We constructed speech enhancement systems and studied their mechanisms this year (accepted in journals and conferences, e.g., ICASSP 2022).
|
Strategy for Future Research Activity |
Next year, we will continue to build concrete speech recognition systems with newly popular models and attack methods, using state-of-the-art frameworks, e.g., the Transformer. To defend against the attacks, we are particularly interested in applying existing technologies, e.g., speech enhancement or adaptation. We expect to publish papers in journals and conferences.
|
Causes of Carryover |
Last year, because of COVID-19, all international conferences and academic visits were canceled. I did not spend the funding and mainly conducted research activities online.
This year, in line with business regulations, I will continue to limit business travel. The funding will therefore be spent on purchasing devices (e.g., a spoken dialogue robot, a database, a musical instrument) and on publication-related costs (e.g., books, conferences, and journal papers).
|
[Presentation] Compressing Transformer-based ASR Model by Task-driven Loss and Attention-based Multi-level Feature Distillation (2022)
Author(s)
Y. Lv, L. Wang, M. Ge, S. Li, C. Ding, L. Pan, Y. Wang, J. Dang, K. Honda
Organizer
in Proc. IEEE-ICASSP, pp. 7992--7996, 2022.
Int'l Joint Research