2021 Fiscal Year Final Research Report
Construction of acoustic model on small training data for dysarthric speech recognition
Project/Area Number |
20K19862
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 61030:Intelligent informatics-related
|
Research Institution | Kobe University |
Principal Investigator |
|
Project Period (FY) |
2020-04-01 – 2022-03-31
|
Keywords | Speech recognition / Dysarthria / Assistive technology for people with disabilities / Machine learning / Neural networks |
Outline of Final Research Achievements |
The goal of this research is to build a speech recognition system that serves as a communication tool for people with dysarthria. A central challenge is that it is difficult to collect enough dysarthric speech to train a speech recognition model. We therefore studied two model training approaches: transfer learning, which exploits large amounts of non-dysarthric speech, and self-supervised learning, which exploits spontaneous speech from people with dysarthria, which is relatively easy to collect. We proposed a multi-step transfer learning method and a method that combines pseudo-labelling with self-supervised learning, and confirmed that both proposed methods outperform conventional methods.
|
Free Research Field |
Media information processing
|
Academic Significance and Societal Importance of the Research Achievements |
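This research is expected to contribute to removing the social barriers faced by people with dysarthria and to supporting their participation in society. Many people whose dysarthria stems from motor impairments also have limited use of their hands and feet, so alternative means of communication such as sign language are often unavailable to them, which makes highly accurate speech recognition necessary. In addition, the shortage of training data, which is the particular focus of this research, is a problem not only for dysarthric speech but for speech recognition in general. We therefore expect that the training methods proposed here, based on multi-step transfer learning and on pseudo-labelling combined with self-supervised learning, can be applied across a wide range of speech recognition fields.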
|