2018 Fiscal Year Final Research Report

Speech synthesis based on articulatory movement HMM and LSP digital filter

Project/Area Number	16K00234
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Perceptual information processing
Research Institution	Tokyo University of Science
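Principal Investigator	Katsurada Kouichi, Tokyo University of Science, Faculty of Science and Technology, Department of Information Sciences, Associate Professor (80324490)
Co-Investigator(Kenkyū-buntansha)	Nitta Tsuneo (新田恒雄), Waseda University, Green Computing Systems Research Organization, Other (Invited Researcher) (70314101); Makino Takehiko (牧野武彦), Chuo University, Faculty of Economics, Professor (00269482); Kanazawa Yasushi (金澤靖), Toyohashi University of Technology, Graduate School of Engineering, Associate Professor (50214432)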
Research Collaborator	Kaburagi Tokihiko, Wakamiya Kohei
Project Period (FY)	2016-04-01 – 2019-03-31
Keywords	Articulatory movement / Speech synthesis / Database construction
Outline of Final Research Achievements	We investigated how to synthesize speech from articulatory features that represent the movement of the lips and tongue during human speech. In the first half of the project period, we built a speech synthesizer driven by features that parameterize the actual movement of the lips and tongue. We then collected lip and tongue movement data using EMA (electromagnetic articulography). We recorded the articulatory movements of a male announcer last year and are now annotating the recordings with IPA (International Phonetic Alphabet) labels.
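Free Research Field	Speech synthesis
Academic Significance and Societal Importance of the Research Achievements	In recent years, advances in deep learning and related techniques have markedly improved the quality of speech synthesis. However, because typical speech synthesis does not use detailed features of human articulation, it has difficulty handling characteristically human phenomena such as mispronunciations and changes in voice quality. The articulatory-movement-based speech synthesis pursued in this research takes an approach close to the mechanism of human speech production, so it may be able to handle such characteristically human changes in the voice. Using such a synthesis model for tasks such as recognizing other people's speech could also provide auxiliary information for recognizing not only the linguistic content but also underlying changes in how the speech is produced (for example, that the speaker has caught a cold or has pain in the mouth).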