2014 Fiscal Year Final Research Report

Development of statistically consistent voice conversion techniques based on joint feature modeling

Research Project

Project/Area Number	24700166
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagoya Institute of Technology
Principal Investigator	NANKAKU Yoshihiko Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (80397497)
Project Period (FY)	2012-04-01 – 2015-03-31
Keywords	Voice conversion
Outline of Final Research Achievements	This project aimed to improve voice conversion techniques, which convert speech waveforms from the original speaker's voice to another speaker's voice. In conventional voice conversion techniques, spectral features and prosodic features such as fundamental frequency (F0) and speaking rate are converted independently. In the proposed technique, these features are modeled consistently using a single statistical model, and all features are converted jointly by exploiting the correlations among them. Experimental results showed that the proposed technique improved the speech quality of the converted voices. Moreover, the project also developed a technique to improve voice conversion when only a very small amount of training data is available.
Free Research Field	Speech information processing