A Study on Algorithms to Improve Intelligibility of Glossectomy Patients' Speech Using Deep Neural Networks

Research Project

Project/Area Number	18K11376
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61010:Perceptual information processing-related
Research Institution	Okayama University
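Principal Investigator	Abe Masanobu, Okayama University, Faculty of Interdisciplinary Science and Engineering in Health Systems, Professor (70595470)
Co-Investigator(Kenkyū-buntansha)	Hara Sunao, Okayama University, Graduate School of Interdisciplinary Science and Engineering in Health Systems, Assistant Professor (50402467); Minagi Shogo, Okayama University, Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Professor (80190693)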
Project Period (FY)	2018-04-01 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
  Fiscal Year 2020: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
  Fiscal Year 2019: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
  Fiscal Year 2018: ¥2,600,000 (Direct Cost: ¥2,000,000, Indirect Cost: ¥600,000)
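Keywords	voice conversion / speech synthesis / subtotal glossectomy patients / DNN / knowledge distillation / tongue cancer / improvement of phonemic intelligibility / lip information / speech intelligibility / deep learning / lip shape / multimodal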
Outline of Final Research Achievements	In this study, we investigated voice conversion algorithms to improve the intelligibility of speech uttered by patients who have articulation disorders because of wide glossectomy and/or segmental mandibulectomy. To achieve real-time processing, the voice conversion directly modifies the waveform using the spectrum differential between a healthy speaker and a glossectomy speaker. The spectrum differential is estimated by deep neural networks (DNNs). To improve performance, we proposed using lip shapes as auxiliary inputs and introducing a knowledge distillation approach to make the best use of phoneme labels as auxiliary inputs. Experimental results showed that both approaches work well, and that phoneme labels with knowledge distillation perform better than lip shapes.
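Academic Significance and Societal Importance of the Research Achievements	Speech is not only a means of communication; it also plays an important role in maintaining human dignity and leading a fulfilling life. When the tongue, jaw, or lips (hereafter, articulatory organs) are resected to treat cancer and the patient can no longer produce intelligible speech, the loss to daily life is immeasurable. In this study, for patients who could no longer speak intelligibly because their tongue was resected during cancer treatment, we proposed a technique for restoring the voice the patient had when healthy and demonstrated its effectiveness. According to a 2017 estimate by the National Cancer Center, there are about 22,800 patients with oral and pharyngeal cancer (about 2% of all cancer patients), and this work showed that these patients may be able to regain their voices.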
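
The spectrum-differential idea in the outline above can be illustrated with a minimal sketch. It assumes the per-bin log-magnitude differential (healthy speaker minus glossectomy speaker) is already available for one frame; in the project it is estimated by a DNN, which is not reproduced here. The function names (`dft`, `idft`, `apply_differential`) are hypothetical, and the plain DFT-domain scaling stands in for whatever filtering the project actually uses, which the record does not specify.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def apply_differential(frame, log_diff):
    """Scale each frequency bin by exp(log_diff[k]) and keep the source phase.

    log_diff is an assumed input: one log-magnitude differential per bin.
    It should be symmetric (log_diff[k] == log_diff[n - k]) so that the
    modified frame stays real-valued.
    """
    spec = dft(frame)
    return idft([s * math.exp(d) for s, d in zip(spec, log_diff)])


# Toy usage: boost only the bins that carry a sinusoid at bin 2 of a 16-sample frame.
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
log_diff = [0.0] * 16
log_diff[2] = log_diff[14] = math.log(3.0)
converted = apply_differential(frame, log_diff)
```

Because the source spectrum is only rescaled and never resynthesized from parameters, a zero differential returns the input frame unchanged; this is the property that makes the differential approach attractive for real-time use.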

Report

(5 results)

2021 Annual Research Report
Final Research Report (PDF)
2020 Research-status Report
2019 Research-status Report
2018 Research-status Report

Research Products
(9 results)


Presentation (7 results, of which Int'l Joint Research: 1 result), Remarks (2 results)

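[Presentation] A Study of a Method for Improving the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech by Knowledge Distillation Using Lip Features (2022)
- Author(s)
  高島和嗣, Masanobu Abe, Sunao Hara
- Organizer
  IEICE Technical Report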
- Related Report
  2021 Annual Research Report
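[Presentation] A Study of Voice Conversion Using Estimated Phoneme Posterior Probabilities to Improve the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech (2020)
- Author(s)
  荻野聖也, Sunao Hara, Masanobu Abe
- Organizer
  IEICE General Conference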
- Related Report
  2019 Research-status Report
[Presentation] DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients' Speech (2019)
- Author(s)
  Hiroki Murakami, Sunao Hara, Masanobu Abe
- Organizer
  APSIPA Annual Summit and Conference 2019
- Related Report
  2019 Research-status Report
- Int'l Joint Research
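[Presentation] A Study of a Voice Conversion Method Using Auxiliary Phoneme Information Based on Bidirectional LSTM-RNN to Improve the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech (2019)
- Author(s)
  Hiroki Murakami, Sunao Hara, Masanobu Abe
- Organizer
  Acoustical Society of Japan 2019 Spring Meeting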
- Related Report
  2018 Research-status Report
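[Presentation] A Study of Methods for Estimating Auxiliary Phoneme Information to Improve the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech by Voice Conversion (2019)
- Author(s)
  荻野聖也, Hiroki Murakami, Sunao Hara, Masanobu Abe
- Organizer
  Acoustical Society of Japan 2019 Spring Meeting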
- Related Report
  2018 Research-status Report
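[Presentation] A Study of Improving the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech by Voice Conversion Using Speech and Lip Shapes (2018)
- Author(s)
  荻野聖也, Hiroki Murakami, Sunao Hara, Masanobu Abe
- Organizer
  IEICE Technical Report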
- Related Report
  2018 Research-status Report
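[Presentation] A Study of Auxiliary Information for Improving the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech by Voice Conversion (2018)
- Author(s)
  Hiroki Murakami, Sunao Hara, Masanobu Abe
- Organizer
  Acoustical Society of Japan 2018 Autumn Meeting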
- Related Report
  2018 Research-status Report
[Remarks] Research on Improving the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech Using Deep Neural Networks
- URL
  https://site-330980-922-1588.mystrikingly.com/
- Related Report
  2021 Annual Research Report
[Remarks] Research on Improving the Phonemic Intelligibility of Subtotal Glossectomy Patients' Speech Using Deep Neural Networks
- URL
  http://site-330980-922-1588.mystrikingly.com/
- Related Report
  2019 Research-status Report
