Research on speech synthesis using non-parametric modeling based on Gaussian process regression

Research Project

Project/Area Number	25540065
Research Category	Grant-in-Aid for Challenging Exploratory Research
Allocation Type	Multi-year Fund
Research Field	Perceptual information processing
Research Institution	Tokyo Institute of Technology
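Principal Investigator	KOBAYASHI Takao, Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Professor (70153616)
Co-Investigator (Kenkyū-buntansha)	NOSE Takashi, Tohoku University, Graduate School of Engineering, Lecturer (90550591)
Research Collaborator	KORIYAMA Tomoki, Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Assistant Professor (50749124)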
Project Period (FY)	2013-04-01 – 2015-03-31
Project Status	Completed (Fiscal Year 2014)
Budget Amount	¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000); Fiscal Year 2014: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000); Fiscal Year 2013: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
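Keywords	text-to-speech synthesis / statistical parametric speech synthesis / HMM-based speech synthesis / Gaussian process regression / kernel function / frame context / statistical speech synthesis / dynamic features / global variance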
Outline of Final Research Achievements	The purpose of this research is to develop a framework that uses non-parametric modeling to synthesize more natural-sounding speech than the conventional HMM-based statistical parametric speech synthesis framework. The proposed modeling approach is based on Gaussian process regression (GPR), and the GPR model is designed to predict frame-level acoustic features directly from the corresponding input linguistic information. We proposed kernel functions for GPR-based speech synthesis and examined several techniques for computational cost reduction, hyperparameter optimization, and prosody modeling using Gaussian process classification and GPR.
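
As a rough illustration of the frame-level prediction described in the outline, the following is a minimal sketch of Gaussian process regression in plain Python. It is not the project's implementation: the squared-exponential kernel, the two-dimensional "context" vectors, the log-F0 targets, and all function names are assumptions made up for this example, and the kernel functions proposed in the project are not reproduced here.

```python
import math

def rbf_kernel(x, y, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed, generic choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def cho_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gpr_predict(train_x, train_y, test_x, noise=1e-2):
    """Posterior mean of a zero-mean GP at each test input."""
    n = len(train_x)
    # Gram matrix of the training frames plus observation noise on the diagonal.
    K = [[rbf_kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = cho_solve(cholesky(K), train_y)  # alpha = (K + noise * I)^-1 y
    return [sum(rbf_kernel(x, train_x[i]) * alpha[i] for i in range(n))
            for x in test_x]

# Hypothetical toy data: one context vector per frame
# (relative position within the phone, numeric phone code).
frames = [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0], [0.0, 2.0], [1.0, 2.0]]
log_f0 = [4.8, 5.0, 4.9, 5.2, 5.3]  # made-up acoustic targets

# Centre the targets so that the zero-mean GP assumption is reasonable.
mean_f0 = sum(log_f0) / len(log_f0)
centered = [v - mean_f0 for v in log_f0]

# Predict the acoustic feature for an unseen frame context.
predicted = [m + mean_f0 for m in gpr_predict(frames, centered, [[0.5, 2.0]])]
print(predicted)
```

Because the prediction for each new frame is a kernel-weighted combination of all training frames, the cost of the exact solve grows cubically with the number of frames, which is one reason cost reduction matters at realistic corpus sizes.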

Report

(3 results)

2014 Annual Research Report
Final Research Report (PDF)
2013 Research-status Report

Research Products
(21 results)

Journal Article (11 results, of which Peer Reviewed: 5, Acknowledgement Compliant: 7, Open Access: 1), Presentation (10 results)

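[Journal Article] A study of a speech synthesis system based on Gaussian process regression (in Japanese) (2015)
- Author(s)
  Tomoki Koriyama, Takao Kobayashi
- Journal Title
  Proceedings of the 2015 Spring Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 269-270
- NAID
  120006703848
- Related Report
  2014 Annual Research Report
- Acknowledgement Compliant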
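[Journal Article] A study of contexts for speech synthesis based on Gaussian process regression (in Japanese) (2015)
- Author(s)
  岡元伶洋, Tomoki Koriyama, Takao Kobayashi
- Journal Title
  Proceedings of the 2015 Spring Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 371-372
- Related Report
  2014 Annual Research Report
- Acknowledgement Compliant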
[Journal Article] Prosody generation using frame-based Gaussian process regression and classification for statistical parametric speech synthesis (2015)
- Author(s)
  Tomoki Koriyama, Takao Kobayashi
- Journal Title
  Proceedings of the 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing
  Volume: ICASSP 2015 Pages: 4929-4933
- NAID
  120006703851
- Related Report
  2014 Annual Research Report
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Statistical Parametric Speech Synthesis Based on Gaussian Process Regression (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  IEEE Journal of Selected Topics in Signal Processing
  Volume: 8 Issue: 2 Pages: 173-183
- DOI
  10.1109/jstsp.2013.2283461
- Related Report
  2014 Annual Research Report
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing
  Volume: ICASSP 2014 Pages: 3862-3866
- DOI
  10.1109/icassp.2014.6854319
- NAID
  120006703288
- Related Report
  2014 Annual Research Report
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Parametric speech synthesis using local and global sparse Gaussian processes (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing
  Volume: MLSP 2014 Pages: 1-6
- DOI
  10.1109/mlsp.2014.6958921
- NAID
  120006703336
- Related Report
  2014 Annual Research Report
- Peer Reviewed / Open Access / Acknowledgement Compliant
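[Journal Article] A study of F0 pattern generation based on Gaussian process regression (in Japanese) (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the 2014 Autumn Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 247-248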
- NAID
  120006703360
- Related Report
  2014 Annual Research Report
- Acknowledgement Compliant
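[Journal Article] A study of hyperparameter optimization in speech synthesis based on Gaussian process regression (in Japanese) (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  IEICE Technical Report, Speech
  Volume: 113, SP2013-99 Pages: 19-24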
- Related Report
  2013 Research-status Report
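[Journal Article] Speech parameter generation based on Gaussian process regression considering global variance (in Japanese) (2014)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the 2014 Spring Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 355-356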
- NAID
  120006702995
- Related Report
  2013 Research-status Report
[Journal Article] Statistical nonparametric speech synthesis using sparse Gaussian processes (2013)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013
  Volume: INTERSPEECH 2013 Pages: 1072-1076
- NAID
  120006702716
- Related Report
  2013 Research-status Report
- Peer Reviewed
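[Journal Article] Speech synthesis based on Gaussian process regression using sparse approximation and convolution kernels (in Japanese) (2013)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proceedings of the 2013 Autumn Meeting of the Acoustical Society of Japan
  Volume: CD-ROM Pages: 311-312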
- NAID
  120006702748
- Related Report
  2013 Research-status Report
[Presentation] Prosody generation using frame-based Gaussian process regression and classification for statistical parametric speech synthesis (2015)
- Author(s)
  Tomoki Koriyama
- Organizer
  2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
- Place of Presentation
  Brisbane Convention & Exhibition Centre (Australia)
- Year and Date
  2015-04-19 – 2015-04-24
- Related Report
  2014 Annual Research Report
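[Presentation] A study of a speech synthesis system based on Gaussian process regression (in Japanese) (2015)
- Author(s)
  Tomoki Koriyama
- Organizer
  2015 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Chuo University, Korakuen Campus (Tokyo)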
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
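[Presentation] A study of contexts for speech synthesis based on Gaussian process regression (in Japanese) (2015)
- Author(s)
  岡元伶洋
- Organizer
  2015 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Chuo University, Korakuen Campus (Tokyo)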
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Annual Research Report
[Presentation] Parametric speech synthesis using local and global sparse Gaussian processes (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  International Workshop on Machine Learning for Signal Processing, MLSP 2014
- Place of Presentation
  Reims Centre des Congrès (France)
- Year and Date
  2014-09-21 – 2014-09-24
- Related Report
  2014 Annual Research Report
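[Presentation] A study of F0 pattern generation based on Gaussian process regression (in Japanese) (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  2014 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Hokkai-Gakuen University, Toyohira Campus (Hokkaido)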
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Annual Research Report
[Presentation] Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
- Place of Presentation
  Fortezza da Basso Convention & Exhibition Centre (Italy)
- Year and Date
  2014-05-04 – 2014-05-09
- Related Report
  2014 Annual Research Report
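[Presentation] A study of hyperparameter optimization in speech synthesis based on Gaussian process regression (in Japanese) (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  IEICE / Acoustical Society of Japan Technical Committee on Speech
- Place of Presentation
  Meijo University, Tempaku Campus (Aichi)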
- Related Report
  2013 Research-status Report
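[Presentation] Speech parameter generation based on Gaussian process regression considering global variance (in Japanese) (2014)
- Author(s)
  Tomoki Koriyama
- Organizer
  2014 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Nihon University, College of Science and Technology, Surugadai Campus (Tokyo)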
- Related Report
  2013 Research-status Report
[Presentation] Statistical nonparametric speech synthesis using sparse Gaussian processes (2013)
- Author(s)
  Tomoki Koriyama
- Organizer
  14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013
- Place of Presentation
  Lyon Convention Centre (France)
- Related Report
  2013 Research-status Report
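[Presentation] Speech synthesis based on Gaussian process regression using sparse approximation and convolution kernels (in Japanese) (2013)
- Author(s)
  Tomoki Koriyama
- Organizer
  2013 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  Toyohashi University of Technology (Aichi)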
- Related Report
  2013 Research-status Report
