2013 Fiscal Year Final Research Report
A study on speech diversification techniques based on corpus design for advanced humanoid speech synthesis
Project/Area Number |
23700195
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Tohoku University (2013) / Tokyo Institute of Technology (2011-2012) |
Principal Investigator |
NOSE Takashi Tohoku University, Graduate School of Engineering, Lecturer (90550591)
|
Project Period (FY) |
2011 – 2012
|
Keywords | speech synthesis / hidden Markov model / statistical speech synthesis / emotional speech synthesis / humanoid robot / speech corpus |
Research Abstract |
Our goal in this research is to realize a more human-like, natural text-to-speech system with various emotional expressions and speaking styles. The achievements of our studies are as follows: (1) We proposed a novel corpus-design technique that takes accent, style, and sentence-final expression into account. (2) We incorporated users' subjective emotional intensities into acoustic model training to improve the performance of expressive speech synthesis. (3) We proposed a technique for automatically labeling emphasis expressions, using a parameter generation technique for fundamental frequency, to realize emphatic speech synthesis. (4) We proposed a cross-lingual speech synthesis technique that uses only speech samples in the target speaker's native language to synthesize multilingual speech at low cost.
|
Research Products
(34 results)