Research on speech synthesis using non-parametric modeling based on Gaussian process regression
Project/Area Number |
25540065
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Perceptual information processing
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
KOBAYASHI Takao Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Professor (70153616)
|
Co-Investigator (Kenkyū-buntansha) |
NOSE Takashi Tohoku University, Graduate School of Engineering, Lecturer (90550591)
|
Research Collaborator |
KORIYAMA Tomoki Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Assistant Professor (50749124)
|
Project Period (FY) |
2013-04-01 – 2015-03-31
|
Project Status |
Completed (Fiscal Year 2014)
|
Budget Amount |
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2014: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2013: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
|
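Keywords | text-to-speech synthesis / statistical parametric speech synthesis / HMM-based speech synthesis / Gaussian process regression / kernel function / frame context / statistical speech synthesis / dynamic features / global variance (intra-sequence variation) |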
Outline of Final Research Achievements |
The purpose of this research was to develop a framework based on non-parametric modeling that synthesizes more natural-sounding speech than the conventional HMM-based statistical parametric speech synthesis framework. The proposed approach is based on Gaussian process regression (GPR): the GPR model is designed to predict frame-level acoustic features directly from the corresponding input linguistic information. We proposed kernel functions for GPR-based speech synthesis and examined several techniques for computational cost reduction, hyperparameter optimization, and prosody modeling using Gaussian process classification and GPR.
|
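As a rough illustration of the frame-level prediction described in the outline, the following minimal sketch runs plain GPR on toy data. It is not the project's method: the squared-exponential kernel, the one-dimensional "frame context" input (a hypothetical relative position of a frame within a phone), the mean-normalized target values, and all function names are assumptions made for the example. The kernel functions actually proposed in the project are not reproduced here.

```python
import math

def rbf_kernel(x, y, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed, generic choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def solve_cholesky(l, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(l[i][k] * z[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def gpr_predict(train_x, train_y, test_x, length_scale=1.0, noise=1e-2):
    """Zero-mean GPR: return predictive means and variances at test_x."""
    n = len(train_x)
    # Gram matrix of the training inputs plus observation-noise variance.
    k = [[rbf_kernel(train_x[i], train_x[j], length_scale)
          + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    l = cholesky(k)
    alpha = solve_cholesky(l, train_y)
    means, variances = [], []
    for x in test_x:
        k_star = [rbf_kernel(x, xi, length_scale) for xi in train_x]
        means.append(sum(a * b for a, b in zip(k_star, alpha)))
        v = solve_cholesky(l, k_star)
        variances.append(rbf_kernel(x, x, length_scale)
                         - sum(a * b for a, b in zip(k_star, v)))
    return means, variances

# Toy data (hypothetical): frame position within a phone -> normalized feature.
train_x = [[0.0], [0.25], [0.5], [0.75], [1.0]]
train_y = [-0.2, 0.0, 0.1, 0.0, -0.2]
test_x = [[0.5], [0.6], [100.0]]
means, variances = gpr_predict(train_x, train_y, test_x,
                               length_scale=0.3, noise=1e-4)
for x, m, v in zip(test_x, means, variances):
    print(x, m, v)
```

The exact solve costs time cubic in the number of training frames, which is presumably why computational cost reduction appears among the topics examined. In this sketch the length scale and noise level are fixed by hand rather than optimized.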
Report
(3 results)
Research Products
(21 results)