2022 Fiscal Year Final Research Report

Construction of articulatory movement database, normalization of databases, and speech synthesis based on the database

Project/Area Number	19K12024
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61010: Perceptual information processing-related
Research Institution	Tokyo University of Science
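Principal Investigator	Katsurada Kouichi, Tokyo University of Science, Faculty of Science and Technology, Department of Information Sciences, Professor (80324490)
Co-Investigator (Kenkyū-buntansha)	Makino Takehiko, Chuo University, Faculty of Economics, Professor (00269482); Wakamiya Kohei, Kyushu University, Faculty of Design, Assistant Professor (70294999)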
Project Period (FY)	2019-04-01 – 2023-03-31
Keywords	EMA / articulatory movement / speech synthesis / rtMRI
Outline of Final Research Achievements	We developed (1) a system that synthesizes speech from EMA data and (2) a system that synthesizes speech from rtMRI data, and we built (3) an articulatory movement database using EMA. The EMA-based system is a multi-speaker model that uses an LSTM and d-vectors; we confirmed that it generates synthesized speech of sufficient quality, especially in speaker-closed synthesis. For synthesis from rtMRI data, we used transposed convolution, which interpolates the time-series data, and found that quality improved as the stride size increased. For the articulatory database, we completed the recording of articulatory movement data for seven speakers and completed IPA annotation for one of them.
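Free Research Field	Speech information processing
Academic Significance and Societal Importance of the Research Achievements	This research confirmed that speech can be generated well from articulatory movements, which represent the motion of the tongue and lips. We confirmed that generation works with both of two types of articulatory movement data recorded by different methods (EMA and rtMRI), and we believe this has contributed, if modestly, to research progress in the field. Articulatory movement data are generally difficult to record, but by recording such data for Japanese in this project, we have made it possible to use articulatory movement data in phonetics and speech information processing research. We believe this contributes in some measure to the development of both fields.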
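
The outline mentions transposed convolution as a way to interpolate time-series data in the rtMRI-based system. The following is a minimal sketch of that general operation in one dimension, not the project's actual model: the function names, the hand-picked triangular kernel, and the toy input are all assumptions made for illustration. In a trained network the kernel weights would be learned rather than fixed.

```python
def output_length(n_in, kernel_size, stride):
    """Length of a 1-D transposed convolution output (no padding, dilation 1)."""
    return (n_in - 1) * stride + kernel_size


def transposed_conv1d(frames, kernel, stride):
    """Upsample a 1-D sequence with a transposed convolution.

    Each input value scatters a scaled copy of the kernel into the output,
    starting `stride` samples after the previous one; overlaps are summed.
    """
    out = [0.0] * output_length(len(frames), len(kernel), stride)
    for i, x in enumerate(frames):
        for k, w in enumerate(kernel):
            out[i * stride + k] += x * w
    return out


# Hypothetical toy input: three low-rate frames of a single feature.
frames = [1.0, 2.0, 3.0]
# Assumed triangular kernel; with stride 2 it performs linear interpolation.
kernel = [0.5, 1.0, 0.5]

upsampled = transposed_conv1d(frames, kernel, stride=2)
print(upsampled)  # [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]
```

With stride 2 the original values 1.0, 2.0, 3.0 reappear at every second position and the samples between them are their averages, which is why a transposed convolution can act as a learnable interpolator. A larger stride yields a proportionally longer output for the same input length.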