Project/Area Number |
15K12071
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Perceptual information processing
|
Research Institution | National Institute of Informatics |
Principal Investigator |
Yamagishi Junichi National Institute of Informatics, Digital Content and Media Sciences Research Division, Associate Professor (70709352)
|
Co-Investigator (Renkei-kenkyūsha) |
TAKAKI Shinji National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Assistant Professor (50735090)
|
Project Period (FY) |
2015-04-01 – 2017-03-31
|
Project Status |
Completed (Fiscal Year 2016)
|
Budget Amount |
¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2016: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2015: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
|
Keywords | speech synthesis / audiobook / collective intelligence / machine learning / interactive / deep learning / speech information processing |
Outline of Final Research Achievements |
Many e-book readers now include speech synthesis functions, so users can listen to e-books as well as read them. If statistical parametric speech synthesis, which can flexibly generate synthetic speech in a variety of voices and speaking styles, is combined with e-book readers, e-books may become a platform on which users interactively control the expressiveness of the synthetic speech. To this end, we advanced acoustic modeling techniques by factorizing speech transformation functions. Specifically, we explicitly factorized speaker and emotional transformations and proposed a new adaptation algorithm that transplants emotional transformations estimated from one speaker to another speaker. We also constructed a new system in which a speaker's gender and age are factorized. Finally, we built a prototype e-book reader based on the proposed speech synthesis techniques to demonstrate these ideas.
|