2011 Fiscal Year Annual Research Report

Dynamic Selection of Parameter Sharing Structures in HMM-Based Speech Synthesis Using the Bayesian Criterion

Research Project

Project/Area Number	10J10062
Research Institution	Nagoya Institute of Technology
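Principal Investigator	Kei Hashimoto, Nagoya Institute of Technology, Graduate School of Engineering, Research Fellow (PD)
Keywords	speech synthesis / Bayesian criterion / parameter sharing structure / prior distribution
Research Abstract	In HMM-based speech synthesis, various criteria have been proposed for selecting the decision-tree structure used for parameter sharing. These criteria generally select, as the optimal parameter sharing structure, the decision tree that scores highest on the training data, which makes it possible to synthesize speech of high quality on average for any text. However, the parameter sharing structure that is optimal for the training data is not necessarily optimal for the text to be synthesized, and the optimal structure is expected to differ from text to text. This project therefore aims to establish a high-quality speech synthesis method that dynamically selects the optimal parameter sharing structure for each text to be synthesized. Because our results so far showed that the prior distribution strongly influences the selection of the parameter sharing structure, we investigated methods for selecting an appropriate prior distribution. Using training data from multiple speakers made it possible to exploit other speakers' training data effectively and to estimate a prior distribution that captures the average, speaker-independent characteristics of speech. Experimental results showed that using this prior distribution allows a more appropriate model structure to be selected and substantially improves the quality of the synthesized speech.
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: We demonstrated the effectiveness of the main objective of this research, a high-quality speech synthesis method that dynamically selects the optimal parameter sharing structure for each text to be synthesized, and achieved a further improvement in synthesized speech quality by proposing a method for estimating an appropriate prior distribution.
Strategy for Future Research Activity	We showed that selecting an appropriate parameter sharing structure for each text substantially improves synthesized speech quality, but the method requires far more computation than conventional approaches. Faster structure selection is needed so that speech can be generated in real-world use without frustrating the user, so algorithms and approximation methods that select the parameter sharing structure quickly without degrading speech quality need to be investigated. One possible approximation is, rather than building the entire parameter sharing structure for each text, to build a text-independent structure in advance and construct only the strongly text-dependent parts for each text.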

Research Products
(11 results)

Journal Article (1 result, of which Peer Reviewed: 1 result) / Presentation (10 results)

[Journal Article] Speech recognition based on statistical model including multiple phonetic decision trees (2011)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Journal Title
  Acoustical Science and Technology
  Volume: 32, No. 6 Pages: 236-243
- DOI
  10.1250/ast.32.236
- Peer Reviewed
[Presentation] Face recognition based on separable lattice 2-D HMMs using variational Bayesian method (2012)
- Author(s)
  Kei Sawada, Akira Tamamori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  ICASSP 2012
- Place of Presentation
  Kyoto
- Year and Date
  2012-03-30
[Presentation] A model structure integration based on Bayesian framework for speech recognition (2012)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  ICASSP 2012
- Place of Presentation
  Kyoto
- Year and Date
  2012-03-30
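[Presentation] Linear regression based on the variational Bayesian method in HMM-based speech synthesis (2012)
- Author(s)
  Kei Hashimoto, Junichi Yamagishi, Peter Bell, Simon King, Steve Renals, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Yokohama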
- Year and Date
  2012-03-15
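[Presentation] Application of annealing control to the training of separable lattice 2-D HMMs using the variational Bayesian method (2012)
- Author(s)
  Kei Sawada, Akira Tamamori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Information Processing Society of Japan National Convention
- Place of Presentation
  Nagoya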
- Year and Date
  2012-03-08
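[Presentation] Face image recognition based on separable lattice 2-D HMMs using the variational Bayesian method (2011)
- Author(s)
  Kei Sawada, Akira Tamamori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Technical Meeting on Pattern Recognition and Media Understanding
- Place of Presentation
  Nagasaki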
- Year and Date
  2011-11-25
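[Presentation] Sharing of prior distributions and model structures among speakers in Bayesian speech synthesis (2011)
- Author(s)
  Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Shimane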
- Year and Date
  2011-09-22
[Presentation] Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2011 (2011)
- Author(s)
  Kei Hashimoto, Shinji Takaki, Keiichiro Oura, Keiichi Tokuda
- Organizer
  Blizzard Challenge 2011
- Place of Presentation
  Turin, Italy
- Year and Date
  2011-09-02
[Presentation] Multi-speaker modeling with shared prior distributions and model structures for Bayesian speech synthesis (2011)
- Author(s)
  Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Interspeech 2011
- Place of Presentation
  Florence, Italy
- Year and Date
  2011-08-28
[Presentation] Bayesian speech recognition based on model structure integration (2011)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Speech Technical Meeting
- Place of Presentation
  Nagoya
- Year and Date
  2011-06-23
[Presentation] An analysis of machine translation and speech synthesis in speech-to-speech translation system (2011)
- Author(s)
  Kei Hashimoto, Junichi Yamagishi, William Byrne, Simon King, Keiichi Tokuda
- Organizer
  ICASSP 2011
- Place of Presentation
  Prague, Czech Republic
- Year and Date
  2011-05-26
