2020 Fiscal Year Annual Research Report
Next generation multilingual End-to-End speech recognition (from G30 to G200)
Project/Area Number |
19K24376
|
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator |
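李 勝, National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher (70840940)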
|
Project Period (FY) |
2019-08-30 – 2021-03-31
|
Keywords | multilingual modeling / low-resourced modeling / speech translation / multi-unit modeling / language identification / disordered speech / code-switched |
Outline of Annual Research Achievements |
In FY2020, I focused on accented speech recognition (English and Chinese) and cross-language-family speech recognition. Multilingual speech recognition technologies were also applied to language identification, speaker recognition, disordered speech recognition, and more complex tasks such as speech translation and adversarial attack. Achievements are as follows:
1. This year's work on multilingual modeling technology was applied to speaker modeling (1 domestic presentation: IEICE-SP), low-resource transfer learning (1 Interspeech SLIMT2020), speech translation (NLP2021 presentation), language identification (1 journal paper in IEEE-TASLP), and disordered speech recognition (1 Interspeech2020 with grant honor, 1 O-COCOSDA).
2. I also found that acoustic modeling unit selection can enhance single-language speech recognition through multi-unit modeling (1 invited full paper at Interspeech SLIMT2020, 1 ICASSP2021) and code-switched speech synthesis (1 Interspeech SLIMT2020, 1 ICONIP paper).
3. The following research topics also benefited from the multilingual modeling technologies: speech separation (1 Interspeech2020 with grant honor), adversarial attack (1 IEEE-SLT demo paper), voice privacy (1 invited report at Interspeech SLIMT2020, 1 Interspeech challenge, 1 ACM-CCS demo), voice activity detection (1 ICASSP2021), and Mandarin tone modeling (1 ICASSP2021).
|
Remarks |
The paper URLs can be found on these pages.
|