2019 Fiscal Year Research-status Report
Next generation multilingual End-to-End speech recognition (from G30 to G200)
Project/Area Number |
19K24376
|
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator |
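李 勝 National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher (70840940)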
|
Project Period (FY) |
2019-08-30 – 2021-03-31
|
Keywords | speech recognition / end-to-end / language identification / disordered speech / speaker diarization |
Outline of Annual Research Achievements |
In FY2019, I focused on algorithm optimization. I was pleased to find that the proposed method is applicable to a wide variety of tasks (multilingual mispronunciation detection, multilingual speech recognition, disordered speech recognition, language identification, and speaker diarization).
Two international papers were accepted at ICASSP2020 (one as joint first author and one as corresponding author), one paper co-authored with equal contribution was accepted at ICME2020, and two papers (one as first author and one as co-author) were accepted at Speaker Odyssey2020. Three domestic presentations were also given at ASJ2020.
In FY2020, we will investigate the core problem of this research topic, namely massive language modeling ability. Both cross-language-family and same-language-family relationships have a strong influence on it.
|
Current Status of Research Progress |
1: Research has progressed more than it was originally planned.
Reason
I was pleased to find that the proposed method is applicable to a variety of tasks (speech recognition, disordered speech recognition, language identification, and speaker diarization).
I also revealed the limits of the proposed method through the extensive study reported in the Odyssey2020 paper.
All of these findings are important for future research.
|
Strategy for Future Research Activity |
In the next step, we will address the core problem of this research topic, namely massive language modeling ability.
Three possible tasks are accented speech recognition, cross-language-family speech recognition, and inter-language-family speech recognition.
|
Causes of Carryover |
In FY2020, COVID-19 affected my research activities. I canceled all on-site meetings for the accepted papers and focused on experiments and new algorithm design.
With the remaining 515,867 JPY, I plan to purchase one more GPU workstation and a proofreading service for the manuscripts.
|