2019 Fiscal Year Final Research Report

Usability Improvement of Speech Recognition Based on Classification and Notification of Recognition Error Causes

Project/Area Number	17K00224
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Perceptual information processing
Research Institution	University of Tsukuba
Principal Investigator	Yamada Takeshi, University of Tsukuba, Faculty of Engineering, Information and Systems, Associate Professor (20312829)
Project Period (FY)	2017-04-01 – 2020-03-31
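Keywords	speech recognition / usability / recognition performance estimation / recognition error cause classification / utterance characteristics / modulation spectrum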
Outline of Final Research Achievements	The performance of speech recognition in actual use varies drastically depending on how the user speaks, yet general users find it difficult to grasp these fluctuations accurately. In this research, we therefore developed a method that judges whether a user's utterance can be recognized correctly and, when it cannot, classifies the cause and notifies the user in an easy-to-understand manner. First, to realize the judgment and classification, we proposed a method based on the modulation spectrum and a deep neural network and confirmed its effectiveness. We then designed an interface that notifies the user of the cause of a recognition error and implemented it on a PC.
Free Research Field	Speech information processing
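Academic Significance and Societal Importance of the Research Achievements	In this research, we developed a method that judges whether speech uttered by a user can be recognized correctly and, when it judges that it cannot, identifies the cause and notifies the user in an easy-to-understand manner. Such a function should by rights be part of the user interface, but it had not previously been realized in speech recognition. The results of this research are expected to greatly improve the usability of speech recognition and to lead to the further spread of speech recognition services. In addition, the research clarified which speech characteristics are difficult for existing technology to recognize, providing guidelines for further improving the accuracy of speech recognition technology.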