2014 Fiscal Year Annual Research Report

Construction of a Voice Conversion Method Based on Statistically Consistent Criteria

Research Project

Project/Area Number	24700166
Research Institution	Nagoya Institute of Technology
Principal Investigator	Yoshihiko Nankaku, Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (80397497)
Project Period (FY)	2012-04-01 – 2015-03-31
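Keywords	voice conversion
Outline of Annual Research Achievements	In this research, we constructed a voice conversion method that converts spectral information, prosodic information, speaking rate, and related features in a unified manner using only a very small amount of data. Whereas conventional voice conversion methods focused only on spectral information, which represents timbre, the proposed method treats all information that carries speaker characteristics, including pitch, intonation, and speaking rate, in a unified manner, so it can exploit the correlations among them to achieve more accurate voice conversion. Evaluation experiments showed that modeling spectral information and fundamental frequency simultaneously improves conversion performance. Simultaneous conversion of spectral features and speaking rate, based on a statistical model that includes a duration model, was also shown to be effective. In addition, we applied the Bayesian criterion, which has recently been applied in speech recognition and speech synthesis, to voice conversion, and proposed a framework for instantly building a highly accurate converter from a very small amount of data. The maximum likelihood (ML) criterion used in conventional voice conversion makes a point estimate of the model parameters, so estimation accuracy degrades when training data are scarce. In contrast, the Bayesian criterion marginalizes over the model parameters and thereby achieves high generalization performance. The Bayesian criterion can also improve model estimation accuracy by using prior information about the data. For setting this prior distribution, we proposed a method that uses a prior structured on the basis of factor analysis. This method uses the factor-analysis structure to automatically obtain an efficient speaker representation in advance from the speech data of many speakers, so the target speaker can be modeled accurately even when only a very small amount of that speaker's speech data is available. Evaluation experiments showed that using the structured prior with the maximum a posteriori (MAP) criterion, an approximation of the Bayesian criterion, improves objective evaluation results.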
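
The contrast between ML point estimation and MAP estimation with a prior can be illustrated with a toy example. The sketch below is not the project's method: it assumes a one-dimensional Gaussian with known variance and a plain Gaussian prior on the mean, standing in for the factor-analysis-based structured prior learned from many speakers. All function names and numeric settings are hypothetical and chosen only for illustration.

```python
import random
import statistics


def ml_mean(samples):
    """ML point estimate of a Gaussian mean: the sample average."""
    return statistics.fmean(samples)


def map_mean(samples, prior_mean, prior_var, noise_var):
    """MAP estimate of a Gaussian mean under the prior N(prior_mean, prior_var).

    With a known observation variance (noise_var), the posterior is Gaussian
    and its mode is a precision-weighted average of the prior mean and the
    sample mean.
    """
    prior_precision = 1.0 / prior_var
    data_precision = len(samples) / noise_var
    weighted = prior_precision * prior_mean + data_precision * statistics.fmean(samples)
    return weighted / (prior_precision + data_precision)


def compare(n_target_samples, n_trials=2000, seed=0):
    """Return (ML, MAP) mean squared error averaged over simulated 'speakers'."""
    rng = random.Random(seed)
    # Assumed toy settings: the prior plays the role of knowledge from many speakers.
    prior_mean, prior_var, noise_var = 0.0, 1.0, 4.0
    ml_err = map_err = 0.0
    for _ in range(n_trials):
        # Draw one target speaker's true parameter, then a few noisy observations.
        true_mean = rng.gauss(prior_mean, prior_var ** 0.5)
        samples = [rng.gauss(true_mean, noise_var ** 0.5) for _ in range(n_target_samples)]
        ml_err += (ml_mean(samples) - true_mean) ** 2
        map_err += (map_mean(samples, prior_mean, prior_var, noise_var) - true_mean) ** 2
    return ml_err / n_trials, map_err / n_trials


if __name__ == "__main__":
    for n in (2, 20, 200):
        ml, mp = compare(n)
        print(f"n={n:3d}  ML MSE={ml:.4f}  MAP MSE={mp:.4f}")
```

Under these assumptions, the MAP estimate shrinks toward the prior mean when only a few target samples are available and approaches the ML estimate as the sample count grows, which mirrors the motivation given above for using prior information when target-speaker data are scarce.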

Research Products
(8 results)

All Journal Article (2 results) (of which Peer Reviewed: 2 results) Presentation (6 results) (of which Invited: 1 result)

[Journal Article] Spectral modeling with contextual additive structure for HMM-based speech synthesis (2014)
- Author(s)
  Shinji Takaki, Yoshihiko Nankaku and Keiichi Tokuda
- Journal Title
  IEEE Journal of Selected Topics in Signal Processing
  Volume: 8, Issue: 2, Pages: 229-238
- DOI
  10.1109/JSTSP.2014.2305919
- Peer Reviewed
[Journal Article] Integration of spectral feature extraction and modeling for HMM-based speech synthesis (2014)
- Author(s)
  Kazuhiro Nakamura, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E97-D, Issue: 6, Pages: 1438-1448
- Peer Reviewed
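[Presentation] A study on the use of generative models in neural-network-based speech synthesis (2014)
- Author(s)
  Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Hokkai-Gakuen University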
- Year and Date
  2014-09-03 – 2014-09-05
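[Presentation] A study on basis clustering in HMM-based speech synthesis based on factor analysis (2014)
- Author(s)
  Takenori Yoshimura, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Hokkai-Gakuen University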
- Year and Date
  2014-09-03 – 2014-09-05
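[Presentation] A study on HMM-based speech synthesis integrating H/L-type accent estimation and acoustic modeling (2014)
- Author(s)
  神谷翔大, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Hokkai-Gakuen University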
- Year and Date
  2014-09-03 – 2014-09-05
[Presentation] Speech research as a statistical machine learning problem (2014)
- Author(s)
  Yoshihiko Nankaku
- Organizer
  IEICE Technical Committee on Speech (SP)
- Place of Presentation
  Hotel Hanamaki
- Year and Date
  2014-07-24 – 2014-07-26
- Invited
[Presentation] Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis (2014)
- Author(s)
  Kanako Shirota, Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
- Organizer
  2014 IEEE International Conference on Acoustics, Speech, and Signal Processing
- Place of Presentation
  Florence, Italy
- Year and Date
  2014-05-06 – 2014-05-09
[Presentation] HMM-based singing voice synthesis and its application to Japanese and English (2014)
- Author(s)
  Kazuhiro Nakamura, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
- Organizer
  2014 IEEE International Conference on Acoustics, Speech, and Signal Processing
- Place of Presentation
  Florence, Italy
- Year and Date
  2014-05-06 – 2014-05-09
