Dynamic Selection of Parameter-Sharing Structures in HMM-Based Speech Synthesis Using the Bayesian Criterion

Research Project

Project/Area Number	10J10062
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagoya Institute of Technology
Principal Investigator	Kei Hashimoto, Nagoya Institute of Technology, Graduate School of Engineering, JSPS Research Fellow (PD)
Project Period (FY)	2010 – 2011
Project Status	Completed (Fiscal Year 2011)
Budget Amount	¥1,400,000 (Direct Cost: ¥1,400,000) Fiscal Year 2011: ¥700,000 (Direct Cost: ¥700,000) Fiscal Year 2010: ¥700,000 (Direct Cost: ¥700,000)
Keywords	speech synthesis / Bayesian criterion / parameter-sharing structure / prior distribution
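Research Abstract	In HMM-based speech synthesis, various criteria have been proposed for selecting the decision-tree structure used for parameter sharing, but these criteria generally select, as the optimal parameter-sharing structure, the decision tree that scores highest on the training data. This makes it possible to synthesize speech of high quality on average for any text. However, the parameter-sharing structure that is optimal for the training data is not necessarily optimal for the text to be synthesized, and the optimal structure is expected to differ from text to text. This project therefore aims to establish a high-quality speech synthesis method that dynamically selects, for each text, the parameter-sharing structure best suited to that text. Earlier results showed that the prior distribution strongly affects the selection of the parameter-sharing structure, so methods for choosing an appropriate prior distribution were investigated. Using training data from multiple speakers made it possible to exploit other speakers' training data effectively and to estimate a prior distribution that captures the average, speaker-independent characteristics of speech. Experimental results showed that this prior distribution allows a more appropriate model structure to be selected and substantially improves the quality of the synthesized speech.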
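Current Status of Research Progress	2: Research has progressed on the whole more than originally planned. Reason: The project demonstrated the effectiveness of a high-quality speech synthesis method that dynamically selects the optimal parameter-sharing structure for each text to be synthesized, which was its main objective. In addition, by proposing a method for estimating an appropriate prior distribution, it achieved a further improvement in synthesized speech quality.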
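Strategy for Future Research Activity	Selecting an appropriate parameter-sharing structure for each text was shown to improve synthesized speech quality substantially, but the method requires far more computation than conventional approaches. Generating synthesized speech comfortably in real-world use requires faster parameter-sharing structure selection, so algorithms and approximation methods that select the structure quickly without degrading speech quality need to be investigated. One possible approximation is, rather than building the entire parameter-sharing structure for each text, to build a text-independent structure in advance and to construct only the strongly text-dependent part of the structure for each text.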

Report

(2 results)

2011 Annual Research Report
2010 Annual Research Report

Research Products
(17 results)

Journal Article (2 results) (of which Peer Reviewed: 2 results) / Presentation (15 results)

[Journal Article] Speech recognition based on statistical models including multiple phonetic decision trees (2011)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Journal Title
  Acoustical Science and Technology
  Volume: 32 Issue: 6 Pages: 236-243
- DOI
  10.1250/ast.32.236
- NAID
  130001258012
- ISSN
  0369-4232, 1346-3969, 1347-5177
- Related Report
  2011 Annual Research Report
- Peer Reviewed
[Journal Article] Bayesian context clustering using cross validation for speech recognition (2011)
- Author(s)
  Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda
- Journal Title
  IEICE TRANSACTIONS on Information & Systems
  Volume: E94-D Pages: 668-678
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Presentation] Face recognition based on separable lattice 2-D HMMs using variational Bayesian method (2012)
- Author(s)
  Kei Sawada, Akira Tamamori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  ICASSP 2012
- Place of Presentation
  Kyoto
- Year and Date
  2012-03-30
- Related Report
  2011 Annual Research Report
[Presentation] A model structure integration based on Bayesian framework for speech recognition (2012)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  ICASSP 2012
- Place of Presentation
  Kyoto
- Year and Date
  2012-03-30
- Related Report
  2011 Annual Research Report
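[Presentation] Linear regression based on the variational Bayesian method for HMM-based speech synthesis (2012)
- Author(s)
  Kei Hashimoto, Junichi Yamagishi, Peter Bell, Simon King, Steve Renals, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Yokohama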
- Year and Date
  2012-03-15
- Related Report
  2011 Annual Research Report
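[Presentation] Application of annealing control to the training of separable lattice 2-D HMMs using the variational Bayesian method (2012)
- Author(s)
  Kei Sawada, Akira Tamamori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  IPSJ National Convention
- Place of Presentation
  Nagoya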
- Year and Date
  2012-03-08
- Related Report
  2011 Annual Research Report
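[Presentation] Face image recognition based on separable lattice 2-D HMMs using the variational Bayesian method (2011)
- Author(s)
  Kei Sawada, Akira Tamamori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Technical Committee on Pattern Recognition and Media Understanding
- Place of Presentation
  Nagasaki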
- Year and Date
  2011-11-25
- Related Report
  2011 Annual Research Report
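[Presentation] Sharing prior distributions and model structures across speakers in Bayesian speech synthesis (2011)
- Author(s)
  Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Shimane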
- Year and Date
  2011-09-22
- Related Report
  2011 Annual Research Report
[Presentation] Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2011 (2011)
- Author(s)
  Kei Hashimoto, Shinji Takaki, Keiichiro Oura, Keiichi Tokuda
- Organizer
  Blizzard Challenge 2011
- Place of Presentation
  Turin, Italy
- Year and Date
  2011-09-02
- Related Report
  2011 Annual Research Report
[Presentation] Multi-speaker modeling with shared prior distributions and model structures for Bayesian speech synthesis (2011)
- Author(s)
  Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Interspeech 2011
- Place of Presentation
  Florence, Italy
- Year and Date
  2011-08-28
- Related Report
  2011 Annual Research Report
[Presentation] Bayesian speech recognition based on model structure integration (2011)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Speech Research Meeting
- Place of Presentation
  Nagoya
- Year and Date
  2011-06-23
- Related Report
  2011 Annual Research Report
[Presentation] An analysis of machine translation and speech synthesis in speech-to-speech translation system (2011)
- Author(s)
  Kei Hashimoto, Junichi Yamagishi, William Byrne, Simon King, Keiichi Tokuda
- Organizer
  ICASSP 2011
- Place of Presentation
  Prague, Czech Republic
- Year and Date
  2011-05-26
- Related Report
  2011 Annual Research Report
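[Presentation] Performance evaluation and analysis of machine translation and speech synthesis in speech-to-speech translation (2011)
- Author(s)
  Kei Hashimoto, Junichi Yamagishi, William Byrne, Simon King, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Tokyo, Waseda University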
- Year and Date
  2011-03-11
- Related Report
  2010 Annual Research Report
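[Presentation] A study of acoustic modeling based on the Bayesian criterion considering multiple parameter-sharing structures (2011)
- Author(s)
  Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Tokyo, Waseda University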
- Year and Date
  2011-03-09
- Related Report
  2010 Annual Research Report
[Presentation] Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2010 (2010)
- Author(s)
  Keiichiro Oura, Kei Hashimoto, Sayaka Shiota, Keiichi Tokuda
- Organizer
  Blizzard Challenge 2010
- Place of Presentation
  Kyoto, ATR
- Year and Date
  2010-09-25
- Related Report
  2010 Annual Research Report
[Presentation] Bayesian speech synthesis integrating training and synthesis processes (2010)
- Author(s)
  Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  SSW7
- Place of Presentation
  Kyoto, ATR
- Year and Date
  2010-09-23
- Related Report
  2010 Annual Research Report
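[Presentation] Bayesian speech synthesis integrating training and synthesis processes (2010)
- Author(s)
  Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
- Organizer
  Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  Osaka, Kansai University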
- Year and Date
  2010-09-15
- Related Report
  2010 Annual Research Report
