A study on speech diversification techniques based on corpus design for advanced humanoid speech synthesis
Project/Area Number |
23700195
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Tohoku University (2013) / Tokyo Institute of Technology (2011-2012) |
Principal Investigator |
NOSE Takashi, Tohoku University, Graduate School of Engineering, Lecturer (90550591)
|
Project Period (FY) |
2011 – 2012
|
Project Status |
Completed (Fiscal Year 2013)
|
Budget Amount |
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2012: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Fiscal Year 2011: ¥2,210,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥510,000)
|
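Keywords | speech synthesis / hidden Markov model / statistical speech synthesis / emotional speech synthesis / humanoid robot / speech corpus / statistical model / emotional speech / corpus design / spoken-language speech synthesis / HMM-based speech synthesis / dialogue speech synthesis / speech corpus design / speech parameter generation / style conversion / singing voice synthesis |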
Research Abstract |
Our goal in this research is to realize a more human-like, natural text-to-speech system with various emotional expressions and speaking styles. The achievements of our studies are as follows: (1) We proposed a novel corpus-design technique that takes accent, style, and sentence-final expression into account. (2) We incorporated users' subjective emotional intensities into acoustic model training to improve the performance of expressive speech synthesis. (3) We proposed a technique for automatically labeling emphasis expressions, using a fundamental frequency (F0) parameter generation technique, to realize emphatic speech synthesis. (4) We proposed cross-lingual speech synthesis that uses only a target speaker's native-language speech samples to synthesize multilingual speech at low cost.
|
Report
(4 results)
Research Products
(106 results)
[Journal Article] HMM-based expressive speech synthesis based on phrase-level F0 context labeling (2013)
Author(s)
Yu Maeno, Takashi Nose, Takao Kobayashi, Tomoki Koriyama, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
Journal Title
Proc. 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Volume: vol.1
Pages: 7859-7863
Related Report
Peer Reviewed