Budget Amount |
¥17,810,000 (Direct Cost: ¥13,700,000, Indirect Cost: ¥4,110,000)
Fiscal Year 2014: ¥5,460,000 (Direct Cost: ¥4,200,000, Indirect Cost: ¥1,260,000)
Fiscal Year 2013: ¥5,590,000 (Direct Cost: ¥4,300,000, Indirect Cost: ¥1,290,000)
Fiscal Year 2012: ¥6,760,000 (Direct Cost: ¥5,200,000, Indirect Cost: ¥1,560,000)
|
Outline of Final Research Achievements |
This research aimed to realize flexible control of prosody and better speech quality in statistical speech synthesis by applying the constraints of the generation process model of fundamental frequency (F0) contours. Several methods were developed, including one that uses F0 contours approximated by the model for HMM training. In this method, the hierarchical F0 contour components derived from the model are handled separately in a multi-stream scheme, which leads to better prosody control while keeping a clear relationship with linguistic information. Lexical emphasis was realized by manipulating the model commands (prosody conversion). Speaker conversion in the multi-speaker case was improved by using a matrix-variate Gaussian mixture model and a deep neural network with speaker-dependent sub-networks. Research was also conducted on Chinese, with preliminary experiments on speech translation.
|