2014 Fiscal Year Final Research Report

A study of speech synthesis for achieving synthetic speech with high quality and variability based on hybrid approach

Research Project

Project/Area Number	25730106
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	Perceptual information processing
Research Institution	Tohoku University
Principal Investigator	NOSE Takashi, Tohoku University, Graduate School of Engineering, Lecturer (90550591)
Project Period (FY)	2013-04-01 – 2015-03-31
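Keywords	statistical speech synthesis / non-linguistic information / paralinguistic information / prosody / multilingual / singing voice synthesis / parameter generation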
Outline of Final Research Achievements	The purpose of this research is to establish a hybrid speech synthesis framework that can synthesize human-like speech with various emotional expressions and/or speaking styles using only a limited amount of speech data. We addressed the following six issues in this research: (1) flexible control of non-linguistic or paralinguistic information appearing in synthetic speech, (2) automatic training of prosodic variations, (3) extension to multilingual or cross-lingual speech synthesis, (4) application to singing voice synthesis, (5) efficient design of speech corpora for synthesis, and (6) improvement of the subjective quality of synthetic speech by modifying the conventional parameter generation method.
Free Research Field	Speech information processing