Summary of Research Achievements
The study of adversarial attacks on speech recognition systems is the starting point of our research; given the increasingly serious security issues surrounding speech-based systems, we aim to provide speech researchers with concrete ideas for improving system security. Speech recognition systems have evolved rapidly alongside deep neural networks, from basic DNNs to end-to-end models, then to self-supervised learning models, and most recently to large language models (LLMs). In the first year of this project, we studied the principles of speech recognition systems in detail and investigated the attack surfaces they expose. We summarized these findings in a review and proposed methods for strengthening both the front-end and the back-end of speech recognition systems. These results were accepted at top conferences in the speech field, including Interspeech and ICASSP.

In the project's second year, we found that attacks on speech recognition systems are not limited to the recognized content but also target speaker attributes, so we expanded the scope of our research. While continuing our work on the front-end and back-end of speech recognition systems, we extended adversarial attacks to speaker attributes, participated in several privacy attack competitions with our proposed attack framework, and broadened our research to spoken dialogue systems. All of these efforts have laid the foundation for further research.