Usability Improvement of Speech Recognition Based on Classification and Notification of Recognition Error Causes

Research Project

Project/Area Number	17K00224
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Perceptual information processing
Research Institution	University of Tsukuba
Principal Investigator	Yamada Takeshi, University of Tsukuba, Faculty of Engineering, Information and Systems, Associate Professor (20312829)
Project Period (FY)	2017-04-01 – 2020-03-31
Project Status	Completed (Fiscal Year 2019)
Budget Amount	¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000); Fiscal Year 2019: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000); Fiscal Year 2018: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000); Fiscal Year 2017: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
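Keywords	speech recognition / usability / recognition performance estimation / classification of recognition error causes / utterance characteristics / modulation spectrum / recognition rate estimation / noise / speaking style / speech and related recognition / machine learning / user interface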
Outline of Final Research Achievements	In actual use, speech recognition performance varies drastically depending on how the user speaks, and general users find it difficult to grasp these fluctuations accurately. In this research, we therefore developed a method that judges whether a user's utterance can be recognized correctly and, when it cannot, classifies the cause and notifies the user in an easy-to-understand manner. First, to realize the judgment and classification, we proposed a method that uses a modulation spectrum and a deep neural network, and confirmed its effectiveness. We then designed an interface that notifies the user of the cause of a recognition error and implemented it on a PC.
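Academic Significance and Societal Importance of the Research Achievements	In this research, we developed a method that judges whether a user's utterance can be recognized correctly and, when it is judged that it cannot, identifies the cause and notifies the user in an easy-to-understand manner. Such a function should inherently be part of the user interface, but it had not previously been realized for speech recognition. The results of this research are expected to greatly improve the usability of speech recognition and lead to the further spread of speech recognition services. In addition, this research clarified which speech characteristics are difficult to recognize with existing technology, providing guidelines for further improving the accuracy of speech recognition technology.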

Report

(4 results)

2019 Annual Research Report, Final Research Report (PDF)
2018 Research-status Report
2017 Research-status Report

Research Products
(7 results)

All Presentation (7 results) (of which Int'l Joint Research: 3 results)

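[Presentation] A study on estimating speech recognition error segments focusing on temporal variation of utterances (2020)
- Author(s)
  Yuqing Shu (舒禹清), Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan 2020 Spring Meeting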
- Related Report
  2019 Annual Research Report
[Presentation] Classification of causes of speech recognition errors using attention-based bidirectional long short-term memory and modulation spectrum (2019)
- Author(s)
  Jennifer Santoso, Takeshi Yamada, Shoji Makino
- Organizer
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2019
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
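[Presentation] A study on classification of utterance characteristics using BLSTM and modulation spectrum (2019)
- Author(s)
  Jennifer Santoso, Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan 2019 Autumn Meeting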
- Related Report
  2019 Annual Research Report
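[Presentation] A study on estimating speech recognition error segments using BLSTM (2019)
- Author(s)
  Yuqing Shu (舒禹清), Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan 2019 Autumn Meeting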
- Related Report
  2019 Annual Research Report
[Presentation] Categorizing error causes related to utterance characteristics in speech recognition (2019)
- Author(s)
  Jennifer Santoso, Takeshi Yamada, Shoji Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing 2019 (NCSP'19)
- Related Report
  2018 Research-status Report
- Int'l Joint Research
[Presentation] Novel speech recognition interface based on notification of utterance volume required in changing noisy environment (2018)
- Author(s)
  Takahiro Goto, Takeshi Yamada, Shoji Makino
- Organizer
  RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing 2018 (NCSP'18)
- Related Report
  2017 Research-status Report
- Int'l Joint Research
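[Presentation] A study on estimating impression rating values for notifying causes of recognition errors in speech recognition (2018)
- Author(s)
  Takahiro Goto, Takeshi Yamada, Shoji Makino
- Organizer
  Acoustical Society of Japan 2018 Spring Meeting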
- Related Report
  2017 Research-status Report
