2014 Fiscal Year Research-status Report

Speech Interaction System Built by "Speaking"

Research Project

Project/Area Number	26540083
Research Institution	Nagoya Institute of Technology
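Principal Investigator	Keiichi Tokuda, Nagoya Institute of Technology, Graduate School of Engineering, Professor (20217483)
Co-Investigator(Kenkyū-buntansha)	Akinobu Lee, Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (80332766); Yoshihiko Nankaku, Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (80397497); Daisuke Yamamoto, Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (00402470)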
Project Period (FY)	2014-04-01 – 2017-03-31
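Keywords	speech synthesis / speech recognition / spoken dialogue / speech interface
Outline of Annual Research Achievements	The aim of this research is to establish a method for building an interface with which content creators for speech interfaces can produce content by "speaking," using the information carried in their speech. The research issues for achieving this aim fall into three tasks: (1) acquiring various kinds of information from speech, (2) reflecting the acquired information in content, and (3) conducting field experiments and verifying effectiveness. In this fiscal year we focused on task (1) while also advancing the underlying technologies of speech recognition, speech synthesis, and spoken dialogue. For task (1), we examined methods for obtaining information such as speaker identity, emotion, and emphasis from speech features such as loudness, pitch, and speaking rate. By using a speech model based on factor analysis, we were able to extract the various kinds of information contained in an utterance into low-dimensional features. Because a variety of voice qualities can be reproduced by using and adjusting these features, the acquired information can be reproduced in content. We also examined a method for extracting prosodic information from actual utterances in abstracted form. For these methods, we also proceeded to examine task (2), reflecting the acquired information in content. The experimental results showed that content creators can flexibly change the voice quality of the synthesized speech that the content outputs, so the work can be said to have advanced beyond the original plan. Going forward, we will continue to examine methods for acquiring information from speech and to examine a framework for flexibly reflecting the acquired information in content.
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: The plan for this fiscal year was to focus on task (1), acquiring various kinds of information from speech, and to prepare for task (2), reflecting the acquired information in content. In practice, we built a framework that reflects the information acquired in task (1) in content and adjusts the output speech, and proceeded as far as experimental evaluation. The research can therefore be said to have progressed beyond the original plan.
Strategy for Future Research Activity	Continuing from this fiscal year, we will advance task (1), acquiring various kinds of information from speech, and task (2), reflecting the acquired information in content. In particular, for task (2), we will work on improving speech synthesis technology to enable more diverse expression, and thereby examine methods for reflecting more information in content. In addition, we will prepare field experiments to verify the effectiveness of the proposed method as a whole.
Causes of Carryover	We originally planned to purchase a compute server for model training, but delivery took longer than expected, so the purchase was deferred to the next fiscal year.
Expenditure Plan for Carryover Budget	The funds will be used to purchase the compute server for model training that is now scheduled for purchase in the next fiscal year.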

Research Products
(17 results)

Journal Article (2 results) (of which Peer Reviewed: 2 results) Presentation (9 results) (of which Invited: 2 results) Book (1 result) Remarks (5 results)

[Journal Article] Integration of spectral feature extraction and modeling for HMM-based speech synthesis (2014)
- Author(s)
  Kazuhiro Nakamura, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E97-D Pages: 1438-1448
- DOI
  10.1587/transinf.E97.D.1438
- Peer Reviewed
[Journal Article] Spectral modeling with contextual additive structure for HMM-based speech synthesis (2014)
- Author(s)
  Shinji Takaki, Yoshihiko Nankaku and Keiichi Tokuda
- Journal Title
  IEEE Journal of Selected Topics in Signal Processing
  Volume: 8 Pages: 229-238
- DOI
  10.1109/JSTSP.2014.2305919
- Peer Reviewed
[Presentation] The effect of neural networks in statistical parametric speech synthesis (2015)
- Author(s)
  Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2015)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-04-19 – 2015-04-24
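[Presentation] Flexible Speech Synthesis Based on Statistical Models: Toward Machines That Speak Like Humans (2014)
- Author(s)
  Keiichi Tokuda
- Organizer
  Spoken Language Symposium (IEEE Fellow commemorative lecture)
- Place of Presentation
  Tokyo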
- Year and Date
  2014-12-15 – 2014-12-15
- Invited
[Presentation] Speech Synthesis Based on Statistical Models: Toward Machines That Speak Like Humans (2014)
- Author(s)
  Keiichi Tokuda
- Organizer
  IEEE Nagoya Section, IEEE Fellow award commemorative lecture
- Place of Presentation
  Aichi
- Year and Date
  2014-12-13 – 2014-12-13
- Invited
[Presentation] Voice Interaction System with 3D-CG Virtual Agent for Stand-alone Smartphones (2014)
- Author(s)
  Daisuke Yamamoto, Keiichiro Oura, Ryota Nishimura, Takahiro Uchiya, Akinobu Lee, Ichi Takumi, Keiichi Tokuda
- Organizer
  The 2nd International Conference on Human Agent Interaction (HAI 2014)
- Place of Presentation
  Tsukuba, Japan
- Year and Date
  2014-10-28 – 2014-10-31
[Presentation] Overview of NITECH HMM-based text-to-speech system for Blizzard Challenge 2014 (2014)
- Author(s)
  Kei Sawada, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Keiichi Tokuda
- Organizer
  Blizzard Challenge 2014 Workshop
- Place of Presentation
  Singapore
- Year and Date
  2014-09-19 – 2014-09-19
[Presentation] A mel-cepstral analysis technique restoring high frequency components from low-sampling-rate speech (2014)
- Author(s)
  Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Interspeech 2014
- Place of Presentation
  Singapore
- Year and Date
  2014-09-14 – 2014-09-18
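[Presentation] A study of HMM-based speech synthesis integrating H/L-type accent estimation and acoustic modeling (2014)
- Author(s)
  神谷翔大, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Hokkaido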
- Year and Date
  2014-09-03 – 2014-09-05
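[Presentation] A study of basis clustering in HMM-based speech synthesis based on factor analysis (2014)
- Author(s)
  Takenori Yoshimura, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Hokkaido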
- Year and Date
  2014-09-03 – 2014-09-05
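[Presentation] A study on the use of generative models in neural-network-based speech synthesis (2014)
- Author(s)
  Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Hokkaido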
- Year and Date
  2014-09-03 – 2014-09-05
[Book] Talkative Computers: The Present and Future of Speech Synthesis Technology (2015)
- Author(s)
  Junichi Yamagishi, Keiichi Tokuda, Tomoki Toda, Yoshiko Miwa
- Total Pages
  210
- Publisher
  Maruzen Library
[Remarks] MMDAgent, a toolkit for building voice interaction systems
- URL
  http://www.mmdagent.jp/
[Remarks] SPTK, a speech signal processing toolkit
- URL
  http://sp-tk.sourceforge.net/
[Remarks] hts_engine API, an HMM-based speech synthesis engine
- URL
  http://hts-engine.sourceforge.net/
[Remarks] Open JTalk, a Japanese text-to-speech system
- URL
  http://open-jtalk.sourceforge.net/
[Remarks] HTS, an HMM-based speech synthesis toolkit
- URL
  http://hts.sp.nitech.ac.jp/
