2022 Fiscal Year Final Research Report
Construction of articulatory movement database, normalization of databases, and speech synthesis based on the database
Project/Area Number |
19K12024
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 61010:Perceptual information processing-related
|
Research Institution | Tokyo University of Science |
Principal Investigator |
|
Co-Investigator (Kenkyū-buntansha) |
牧野 武彦 Chuo University, Faculty of Economics, Professor (00269482)
若宮 幸平 Kyushu University, Faculty of Design, Assistant Professor (70294999)
|
Project Period (FY) |
2019-04-01 – 2023-03-31
|
Keywords | EMA / articulatory movement / speech synthesis / rtMRI |
Outline of Final Research Achievements |
We (1) developed a speech synthesis system driven by EMA data, (2) developed a speech synthesis system driven by rtMRI data, and (3) built an articulatory movement database using EMA. The EMA-based system was built for multiple speakers using an LSTM and d-vectors, and we confirmed that it generates synthesized speech of sufficient quality, especially in the speaker-closed condition. For synthesis from rtMRI data, we used transposed convolution to interpolate the time-series data, and the results showed that quality improved as the stride was increased. For the articulatory database, we completed recording articulatory movement data from seven speakers, and IPA labeling has been completed for one of them.
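As a rough illustration of how a transposed convolution can interpolate a time series, the sketch below upsamples a short feature track in plain Python. It is not the project's model: the function names, the toy input, and the fixed triangular kernel are assumptions made for illustration (in a trained network the kernel weights would be learned, and the input would be multi-channel image features rather than a single scalar track). It only shows how the stride sets the upsampling factor.

```python
def transposed_conv1d(x, kernel, stride):
    """Single-channel 1-D transposed convolution (no padding, no bias)."""
    out_len = (len(x) - 1) * stride + len(kernel)
    y = [0.0] * out_len
    # Each input frame "stamps" a scaled copy of the kernel into the output,
    # and consecutive stamps are offset by `stride` output frames.
    for t, value in enumerate(x):
        for k, w in enumerate(kernel):
            y[t * stride + k] += value * w
    return y


def linear_interp_kernel(stride):
    """Triangular kernel (an assumption here) that yields linear interpolation."""
    return [1.0 - abs(k - (stride - 1)) / stride for k in range(2 * stride - 1)]


# Hypothetical low-frame-rate feature track (values chosen for illustration).
frames = [0.0, 1.0, 3.0]
stride = 2

upsampled = transposed_conv1d(frames, linear_interp_kernel(stride), stride)
print(upsampled)  # [0.0, 0.0, 0.5, 1.0, 2.0, 3.0, 1.5]
```

In this toy setting the original frames reappear at output indices 1, 3, and 5, with interpolated values in between; the first and last output values are edge effects from the unpadded kernel. A larger stride inserts more output frames per input frame.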
|
Free Research Field |
Speech information processing
|
Academic Significance and Societal Importance of the Research Achievements |
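This research confirmed that speech can be generated well from articulatory movements, which represent the motion of the tongue and lips. We confirmed that generation is possible from both of two types of articulatory movement data recorded by different methods (EMA and rtMRI), and we believe this contributes, if only modestly, to the progress of research in this field. Articulatory movement data are generally difficult to record, but by recording articulatory movement data for Japanese in this project, we have made it possible to use such data in phonetics and speech information processing research. We believe this contributes in some measure to the development of phonetics and speech information processing.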
|