2017 Fiscal Year Research-status Report

Research and Development of a Japanese Pronunciation Learning System Using Average Voice Morphing

Research Project

Project/Area Number	16K13253
Research Institution	Tohoku University
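Principal Investigator	Takashi Nose, Tohoku University, Graduate School of Engineering, Associate Professor (90550591)
Co-Investigator (Kenkyū-buntansha)	Yuya Chiba, Tohoku University, Graduate School of Engineering, Assistant Professor (30780936)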
Project Period (FY)	2016-04-01 – 2019-03-31
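Keywords	e-learning / computer-assisted language learning (CALL) systems / pronunciation learning / statistical parametric speech synthesis / deep learning / prosody substitution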
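Outline of Annual Research Achievements	This project aims to realize a new framework that lets non-native speakers in Japan learn Japanese pronunciation at low cost, easily, and reliably. Specifically, it uses statistical parametric speech synthesis with an average teacher voice model trained on speech from multiple teacher speakers, and substitutes the phonemic and prosodic (pitch and rhythm) properties of speech feature by feature, which enables more detailed and more accurate pronunciation score labeling than before. A pronunciation score database is then newly built using this technique. With this database, pronunciation score prediction models are trained separately for phonemes, accent, and rhythm, and the pronunciation scores of non-native speakers are predicted so that pronunciation learning becomes more efficient. In addition, the project proposes a way to acquire correct pronunciation more steadily by giving teacher-speech feedback through average voice morphing, which interpolates features stepwise between the average voice and the user's voice. In this fiscal year (the first year), the speech synthesis method was changed from the conventional hidden Markov model-based approach to a deep learning-based approach, which produced higher-quality synthetic speech. Subjective evaluations were also carried out to assign accent and rhythm scores, and prediction experiments based on support vector regression using those scores confirmed the effectiveness of the approach.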
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: In addition to the planned prosody substitution using deep learning-based speech synthesis and the score assignment based on it, score prediction using the resulting scores as training data was also carried out.
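Strategy for Future Research Activity	Going forward, scores will be assigned to more data by subjective evaluation as before, with the aim of improving score prediction accuracy, and improvements to the quality of the synthetic speech after prosody substitution will also be examined.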
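
The outline describes two feature-level operations: substituting a single property (for example pitch) with the teacher's, and average voice morphing, which interpolates features stepwise between the user and the average voice. The following is a minimal sketch of both ideas, not the project's implementation. It assumes that features are stored as time-aligned per-frame lists keyed by a stream name, and that the interpolation is linear; the report states neither. The names `Features`, `substitute`, `morph`, and `morph_steps`, and the stream names `log_f0` and `duration`, are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: stream name -> per-frame values.
# Learner and teacher streams are assumed to be time-aligned already.
Features = Dict[str, List[float]]


def substitute(learner: Features, teacher: Features, stream: str) -> Features:
    """Copy the learner's features, taking one stream from the teacher."""
    out = {name: list(values) for name, values in learner.items()}
    out[stream] = list(teacher[stream])
    return out


def morph(learner: Features, teacher: Features, alpha: float) -> Features:
    """Linearly interpolate each stream (assumed form of morphing).

    alpha = 0.0 returns the learner's values, alpha = 1.0 the teacher's.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    out: Features = {}
    for name, values in learner.items():
        target = teacher[name]
        if len(values) != len(target):
            raise ValueError(f"stream {name!r} is not time-aligned")
        out[name] = [(1.0 - alpha) * v + alpha * t
                     for v, t in zip(values, target)]
    return out


def morph_steps(learner: Features, teacher: Features,
                n_steps: int) -> List[Features]:
    """Return n_steps + 1 feature sets moving from learner to teacher."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return [morph(learner, teacher, k / n_steps) for k in range(n_steps + 1)]
```

Under these assumptions, `morph_steps(learner, teacher, 4)` yields five feature sets, from the learner's own values to the average voice's, each of which would be passed to a synthesizer to produce one feedback utterance. `substitute(learner, teacher, "log_f0")` replaces only the pitch stream, which is the kind of per-feature substitution the outline uses for score labeling.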

Research Products
(4 results)

All Journal Article (2 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 2 results) Presentation (2 results) (of which Int'l Joint Research: 2 results)

[Journal Article] Voice Conversion from Arbitrary Speakers Based on Deep Neural Networks with Adversarial Learning (2017)
- Author(s)
  Sou Miyamoto, Takashi Nose, Suzunosuke Ito, Harunori Koike, Yuya Chiba, Akinori Ito, Takahiro Shinozaki
- Journal Title
  Proceeding of the Thirteenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
  Volume: 2 Pages: 97-103
- DOI
  10.1007/978-3-319-63859-1_13
- Peer Reviewed / Int'l Joint Research
[Journal Article] Development and Evaluation of Julius-Compatible Interface for Kaldi ASR (2017)
- Author(s)
  Yusuke Yamada, Takashi Nose, Yuya Chiba, Akinori Ito and Takahiro Shinozaki
- Journal Title
  Proceeding of the Thirteenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
  Volume: 2 Pages: 91-96
- DOI
  10.1007/978-3-319-63859-1_12
- Peer Reviewed / Int'l Joint Research
[Presentation] Voice Conversion from Arbitrary Speakers Based on Deep Neural Networks with Adversarial Learning (2017)
- Author(s)
  Sou Miyamoto
- Organizer
  The Thirteenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
- Int'l Joint Research
[Presentation] Development and Evaluation of Julius-Compatible Interface for Kaldi ASR (2017)
- Author(s)
  Yusuke Yamada
- Organizer
  The Thirteenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
- Int'l Joint Research
