Project/Area Number |
23700195
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Tohoku University (2013) / Tokyo Institute of Technology (2011-2012) |
Principal Investigator |
NOSE Takashi, Tohoku University, Graduate School of Engineering, Lecturer (90550591)
|
Project Period (FY) |
2011 – 2012
|
Project Status |
Completed (Fiscal Year 2013)
|
Budget Amount |
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2012: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Fiscal Year 2011: ¥2,210,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥510,000)
|
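Keywords | speech synthesis / hidden Markov model / statistical speech synthesis / emotional speech synthesis / humanoid robot / speech corpus / statistical model / emotional speech / corpus design / spoken-language speech synthesis / HMM-based speech synthesis / dialogue speech synthesis / speech corpus design / speech parameter generation / style conversion / singing voice synthesis |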
Research Abstract |
The goal of this research is to realize a more human-like, natural text-to-speech system with various emotional expressions and speaking styles. The achievements of our studies are as follows: (1) We proposed a novel corpus-design technique that takes accent, style, and sentence-final expression into account. (2) We incorporated users' subjective emotional intensities into acoustic model training to improve the performance of expressive speech synthesis. (3) We proposed a technique for automatically labeling emphasis expressions, based on fundamental frequency parameter generation, to realize emphatic speech synthesis. (4) We proposed a cross-lingual speech synthesis technique that uses only speech samples in the target speaker's native language to synthesize multilingual speech at low cost.
|