2014 Fiscal Year Annual Research Report

Research on Nonparametric Speech Synthesis Based on Gaussian Process Regression Models

Research Project

Project/Area Number	25540065
Research Institution	Tokyo Institute of Technology
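Principal Investigator	Takao Kobayashi, Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Professor (70153616)
Co-Investigator(Kenkyū-buntansha)	Takashi Nose, Tohoku University, Graduate School of Engineering, Lecturer (90550591)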
Project Period (FY)	2013-04-01 – 2015-03-31
Keywords	text-to-speech synthesis / statistical speech synthesis / Gaussian process regression / dynamic features / global variance
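Outline of Annual Research Achievements	This project aims to develop a text-to-speech synthesis method based on nonparametric modeling with Gaussian process regression. In the first year, we proposed a method for generating spectral parameter sequences from a Gaussian process regression model and showed that it outperforms conventional modeling based on hidden Markov models. This fiscal year, we focused on further improving the performance of the proposed method. First, we formulated the proposed modeling method with global variance, which is known to be useful for suppressing over-smoothing of generated parameters in conventional statistical parametric speech synthesis. We also formulated parameter generation that takes dynamic features into account. We showed that introducing global variance and dynamic features further reduces the spectral distortion of the synthetic speech. In addition, we proposed a method for estimating the optimal hyperparameters used by these methods and showed that hyperparameter tuning can be automated. Next, aiming to build a speech synthesis system within a unified framework based on Gaussian process regression, we began developing modeling and parameter generation methods for prosody in addition to the speech spectrum. We carried out basic studies on voiced/unvoiced region estimation using Gaussian process classification, on modeling and parameter generation of fundamental frequency (F0) patterns based on Gaussian process regression, and on frame contexts that are useful for prosody generation. These studies indicate that the proposed speech synthesis system is feasible. In parallel with developing the new synthesis framework, we recorded audiobook-style speech and singing voice, both of which are harder to reproduce with synthetic speech than read speech, and thereby prepared the basis for evaluating the proposed method. Building on these results, we plan to realize a speech synthesis system within a unified Gaussian process regression framework and to extend the research to speech synthesis with diverse speaker characteristics and diverse styles, including conversational speech, and to multilingual speech synthesis.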
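
The core operation behind the outline above, predicting an acoustic parameter for each frame by Gaussian process regression, can be illustrated with a minimal sketch. This is textbook Gaussian process regression and not the project's actual model: the squared-exponential kernel, the scalar input standing in for frame-level context features, the toy sine-wave target, and the names `rbf_kernel` and `gp_predict` are all assumptions made for illustration. The report does not specify the kernel or the features used.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b.

    The kernel choice is an assumption; the report does not name one.
    """
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)


def gp_predict(x_train, y_train, x_test, noise_var=1e-2, **kernel_args):
    """Return the GP posterior mean and variance at each row of x_test."""
    k_tt = rbf_kernel(x_train, x_train, **kernel_args)
    k_st = rbf_kernel(x_test, x_train, **kernel_args)
    k_ss = rbf_kernel(x_test, x_test, **kernel_args)
    # Cholesky factor of the noisy training covariance for a stable solve.
    chol = np.linalg.cholesky(k_tt + noise_var * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = np.diag(k_ss) - np.sum(v ** 2, axis=0)
    return mean, var


# Toy data: a hypothetical one-dimensional "context" input and a noisy target
# that stands in for one spectral or F0 parameter per frame.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 20)[:, None]
y_train = np.sin(2.0 * np.pi * x_train[:, 0]) + 0.05 * rng.standard_normal(20)
x_test = np.linspace(0.0, 1.0, 50)[:, None]

mean, var = gp_predict(
    x_train, y_train, x_test, noise_var=0.05 ** 2, length_scale=0.2
)
```

In this sketch `length_scale`, `signal_var`, and `noise_var` are fixed by hand. These are the kind of hyperparameters whose estimation the outline describes automating; a common approach in general Gaussian process practice is to maximize the marginal likelihood, although the report does not state which criterion was used.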

Research Products
(13 results)

Journal Article (7 results; of which Peer Reviewed: 4 results, Acknowledgement Compliant: 7 results), Presentation (6 results)

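[Journal Article] A study of a speech synthesis system based on Gaussian process regression [ガウス過程回帰に基づく音声合成システムの検討] (2015)
- Author(s)
  Tomoki Koriyama, Takao Kobayashi
- Journal Title
  Proceedings of the 2015 Spring Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 269-270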
- Acknowledgement Compliant
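[Journal Article] A study of contexts for speech synthesis based on Gaussian process regression [ガウス過程回帰に基づく音声合成のためのコンテキストの検討] (2015)
- Author(s)
  岡元伶洋, Tomoki Koriyama, Takao Kobayashi
- Journal Title
  Proceedings of the 2015 Spring Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 371-372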
- Acknowledgement Compliant
[Journal Article] Prosody generation using frame-based Gaussian process regression and classification for statistical parametric speech synthesis (2015)
- Author(s)
  Tomoki Koriyama, Takao Kobayashi
- Journal Title
  Proceedings of 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing
  Volume: ICASSP 2015 Pages: 4929-4933
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Statistical parametric speech synthesis based on Gaussian process regression (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  IEEE Journal of Selected Topics in Signal Processing
  Volume: 8 Pages: 173-183
- DOI
  10.1109/JSTSP.2013.2283461
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing
  Volume: ICASSP 2014 Pages: 3862-3866
- DOI
  10.1109/ICASSP.2014.6854319
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Parametric speech synthesis using local and global sparse Gaussian processes (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of IEEE International Workshop on Machine Learning for Signal Processing
  Volume: MLSP 2014 Pages: 1-6
- DOI
  10.1109/MLSP.2014.6958921
- Peer Reviewed / Acknowledgement Compliant
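[Journal Article] A study of F0 pattern generation based on Gaussian process regression [ガウス過程回帰に基づくF0パタン生成の検討] (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the 2014 Autumn Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 247-248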
- Acknowledgement Compliant
[Presentation] Prosody generation using frame-based Gaussian process regression and classification for statistical parametric speech synthesis (2015)
- Author(s)
  Tomoki Koriyama
- Organizer
  2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
- Place of Presentation
  Brisbane Convention & Exhibition Centre (Australia)
- Year and Date
  2015-04-19 – 2015-04-24
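[Presentation] A study of a speech synthesis system based on Gaussian process regression [ガウス過程回帰に基づく音声合成システムの検討] (2015)
- Author(s)
  Tomoki Koriyama
- Organizer
  2015 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Chuo University, Korakuen Campus (Tokyo)
- Year and Date
  2015-03-16 – 2015-03-18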
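[Presentation] A study of contexts for speech synthesis based on Gaussian process regression [ガウス過程回帰に基づく音声合成のためのコンテキストの検討] (2015)
- Author(s)
  岡元伶洋
- Organizer
  2015 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Chuo University, Korakuen Campus (Tokyo)
- Year and Date
  2015-03-16 – 2015-03-18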
[Presentation] Parametric speech synthesis using local and global sparse Gaussian processes (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  International Workshop on Machine Learning for Signal Processing, MLSP2014
- Place of Presentation
  Reims Centre De Congres (France)
- Year and Date
  2014-09-21 – 2014-09-24
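[Presentation] A study of F0 pattern generation based on Gaussian process regression [ガウス過程回帰に基づくF0パタン生成の検討] (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  2014 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Hokkai-Gakuen University, Toyohira Campus (Hokkaido)
- Year and Date
  2014-09-03 – 2014-09-05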
[Presentation] Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
- Place of Presentation
  "Fortezza Da Basso" Convention & Exhibition Centre (Italy)
- Year and Date
  2014-05-04 – 2014-05-09
