Construction of acoustic model on small training data for dysarthric speech recognition

Research Project

Project/Area Number	20K19862
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 61030:Intelligent informatics-related
Research Institution	Kobe University
Principal Investigator	Takashima Ryoichi Kobe University, Research Center for Urban Safety and Security, Associate Professor (50846102)
Project Period (FY)	2020-04-01 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥3,510,000 (Direct Cost: ¥2,700,000, Indirect Cost: ¥810,000) Fiscal Year 2021: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000) Fiscal Year 2020: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
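Keywords	Speech recognition / Dysarthria / Assistive technology for people with disabilities / Machine learning / Neural networks
Outline of Research at the Start	This research develops a speech recognition system for people with dysarthria, with the aim of supporting their communication. Because dysarthric speech differs greatly in its characteristics from the speech of people without speech impairments, a model must be built to fit each individual speaker. However, collecting enough dysarthric speech to train a model is difficult, so models have to be built from small amounts of data. This research investigates a method that extracts dysarthric features common across data from various languages and various dysarthria symptoms, so that the model can be trained by compensating for each speaker's data shortage even when only a small amount of data is available from that speaker. Through field experiments, we also aim to realize a dysarthric speech recognition system accurate enough for practical use.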
Outline of Final Research Achievements	The goal of this research is to build a dysarthric speech recognition system as a communication aid for people with dysarthria. One challenge is that it is difficult to collect enough dysarthric speech to train the speech recognition model. We therefore studied two kinds of model training methods: transfer learning, which exploits a large amount of speech from speakers without dysarthria, and self-supervised learning, which exploits spontaneous speech of people with dysarthria, which is relatively easy to collect. We proposed multi-step transfer learning and a method that combines pseudo-labelling with self-supervised learning, and confirmed that both proposed methods outperformed conventional methods.
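Academic Significance and Societal Importance of the Research Achievements	This research is expected to contribute to removing the social barriers faced by people with dysarthria and to supporting their participation in society. People whose dysarthria stems from a motor disorder often also have impaired limbs and therefore cannot use alternative means of communication such as sign language, so highly accurate speech recognition is needed. In addition, the shortage of training data, which is the particular focus of this research, is a problem not only for dysarthric speech but for speech recognition in general. We therefore expect the training methods proposed here, multi-step transfer learning and the combination of pseudo-labelling with self-supervised learning, to be applicable across a wide range of speech recognition tasks.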

Report

(3 results)

2021 Annual Research Report Final Research Report ( PDF )
2020 Research-status Report

Research Products
(16 results)

All 2022 2021 2020 Other

All Presentation (15 results) (of which Int'l Joint Research: 6 results) Remarks (1 results)

[Presentation] Data Augmentation for Dysarthric Speech Recognition Based on Text-to-Speech Synthesis (2022)
- Author(s)
  Yuki Matsuzaka, Ryoichi Takashima, Chiho Sasaki, Tetsuya Takiguchi
- Organizer
  IEEE LifeTech
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Adaptation of a Pronunciation Dictionary for Dysarthric Speech Recognition (2022)
- Author(s)
  Yuya Sawa, Ryoichi Takashima, Tetsuya Takiguchi
- Organizer
  IEEE LifeTech
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
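[Presentation] Dynamically Weighted Multi-Task Learning of Self-Supervised Learning and Pseudo-Labelling for Dysarthric Speech Recognition (2022)
- Author(s)
  澤佑哉, 相原龍, 高島遼一, 滝口哲也, 今井良枝
- Organizer
  Proceedings of the Acoustical Society of Japan 2022 Spring Meeting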
- Related Report
  2021 Annual Research Report
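[Presentation] A Study of Data Augmentation Based on Text-to-Speech Synthesis for Dysarthric Speech Recognition (2022)
- Author(s)
  松坂勇樹, 高島遼一, 佐々木千穂, 滝口哲也
- Organizer
  Proceedings of the Acoustical Society of Japan 2022 Spring Meeting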
- Related Report
  2021 Annual Research Report
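[Presentation] Training a Speech Recognition Model for Organic Articulation Disorders Using Speech from Speakers with Different Disorders (2022)
- Author(s)
  冨士原健斗, 高島遼一, 杉山千尋, 田中信和, 野原幹司, 野崎一徳, 滝口哲也
- Organizer
  Proceedings of the Acoustical Society of Japan 2022 Spring Meeting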
- Related Report
  2021 Annual Research Report
[Presentation] Data Augmentation Based on Frequency Warping for Recognition of Cleft Palate Speech (2021)
- Author(s)
  Kento Fujiwara, Ryoichi Takashima, Chihiro Sugiyama, Nobukazu Tanaka, Kanji Nohara, Kazunori Nozaki, Tetsuya Takiguchi
- Organizer
  APSIPA
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
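[Presentation] Dysarthric Speech Recognition Combining Pseudo-Labelling and Feature Representation Learning (2021)
- Author(s)
  澤佑哉，冨士原健斗，相原龍，高島遼一，滝口哲也，今井良枝
- Organizer
  Proceedings of the Acoustical Society of Japan 2021 Autumn Meeting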
- Related Report
  2021 Annual Research Report
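[Presentation] A Study on Improving Speech Recognition Accuracy for Speakers with Organic Articulation Disorders Based on Error Correction (2021)
- Author(s)
  冨士原健斗，高島遼一，杉山千尋，田中信和，野原幹司，野崎一徳，滝口哲也
- Organizer
  Proceedings of the Acoustical Society of Japan 2021 Autumn Meeting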
- Related Report
  2021 Annual Research Report
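[Presentation] A Study of Data Augmentation Methods for Speech Recognition of Speakers with Cleft Lip and Palate (2021)
- Author(s)
  冨士原健斗，高島遼一，杉山千尋，田中信和，野原幹司，野崎一徳，滝口哲也
- Organizer
  Proceedings of the Acoustical Society of Japan 2021 Spring Meeting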
- Related Report
  2020 Research-status Report
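[Presentation] Dysarthric Speech Recognition Using Unlabelled Spontaneous Speech via Self-Supervised Learning (2021)
- Author(s)
  澤佑哉，冨士原健斗，相原龍，高島遼一，滝口哲也，本山信明
- Organizer
  Proceedings of the Acoustical Society of Japan 2021 Spring Meeting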
- Related Report
  2020 Research-status Report
[Presentation] Dysarthric Speech Recognition Based on Deep Metric Learning (2020)
- Author(s)
  Yuki Takashima, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  Interspeech
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] An Investigation of End-to-End Speech Recognition Using Model Adaptation for Dysarthric Speakers (2020)
- Author(s)
  Yuya Sawa, Ryoichi Takashima, Tetsuya Takiguchi
- Organizer
  IEEE Global Conference on Consumer Electronics (GCCE)
- Related Report
  2020 Research-status Report
- Int'l Joint Research
[Presentation] Two-Step Acoustic Model Adaptation for Dysarthric Speech Recognition (2020)
- Author(s)
  Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
- Organizer
  ICASSP
- Related Report
  2020 Research-status Report
- Int'l Joint Research
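[Presentation] A Study of Pronunciation Dictionary Adaptation for Dysarthric Speech Recognition (2020)
- Author(s)
  澤佑哉, 高島遼一, 滝口哲也, 有木康雄
- Organizer
  Proceedings of the Acoustical Society of Japan 2020 Autumn Meeting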
- Related Report
  2020 Research-status Report
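[Presentation] Comparative Evaluation of Recognition Models for Dysarthric Speech Recognition (2020)
- Author(s)
  高島遼一, 有木康雄, 滝口哲也
- Organizer
  Proceedings of the Acoustical Society of Japan 2020 Autumn Meeting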
- Related Report
  2020 Research-status Report
[Remarks] Researcher's web page
- URL
  http://www.me.cs.scitec.kobe-u.ac.jp/~rtakashima/
- Related Report
  2021 Annual Research Report 2020 Research-status Report
