2020 Fiscal Year Final Research Report
Development of high-quality speech analysis-synthesis systems with the ability to extract 3D vocal tract shape and vocal cord vibration signal precisely
Project/Area Number |
17K00253
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | Meijo University |
Principal Investigator |
Banno Hideki Meijo University, Faculty of Science and Technology, Associate Professor (20335003)
|
Project Period (FY) |
2017-04-01 – 2021-03-31
|
Keywords | 3D vocal tract shape / vocal tract area function / PARCOR analysis / formant / FDTD method |
Outline of Final Research Achievements |
One way to estimate vocal tract shape from a speech signal is the PARCOR analysis-based method, which converts the PARCOR coefficients of the signal into a vocal tract area function. However, the estimated area function is sometimes inaccurate and cannot always represent complicated shapes. We therefore began studying a method for precisely estimating the 3-D vocal tract shape from a speech signal. First, physical 1-D vocal tract models corresponding to the estimated vocal tract area function were fabricated by 3-D printing, and their acoustic characteristics were measured. Second, these measurements were compared with results from acoustic simulation methods such as the FDTD method, which computes acoustic characteristics from shape information. Finally, we improved our method based on these comparisons.
|
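The PARCOR-based conversion mentioned in the outline can be illustrated with a minimal sketch. It is not the project's implementation: it assumes the textbook lossless-tube model, estimates reflection (PARCOR-type) coefficients with the Levinson-Durbin recursion on a single frame, and maps them to relative section areas. The function names `reflection_coefficients` and `area_function`, the `lips_area` parameter, and the sign and ordering convention (areas accumulated from the lips toward the glottis) are assumptions made for illustration; conventions differ between references, and practical systems usually add pre-emphasis and windowing.

```python
import numpy as np


def reflection_coefficients(frame, order):
    """Estimate reflection coefficients of one frame via Levinson-Durbin.

    Uses the autocorrelation method; the frame must not be all zeros.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Autocorrelation r[0..order] of the frame.
    r = np.array([np.dot(x[:n - lag], x[lag:]) for lag in range(order + 1)])

    a = np.zeros(order + 1)  # prediction polynomial, a[0] = 1
    a[0] = 1.0
    err = r[0]               # prediction error energy
    k = np.zeros(order)      # reflection coefficients

    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        ki = -acc / err
        k[i - 1] = ki
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + ki * a_prev[i - j]
        a[i] = ki
        err *= 1.0 - ki * ki
    return k


def area_function(k, lips_area=1.0):
    """Convert reflection coefficients to relative tube section areas.

    Assumed convention: each successive section area equals the previous
    one times (1 - k) / (1 + k), starting from the section at the lips.
    """
    areas = [lips_area]
    for ki in k:
        areas.append(areas[-1] * (1.0 - ki) / (1.0 + ki))
    return np.array(areas)


if __name__ == "__main__":
    # Synthetic frame (decaying exponential), used only as a toy input.
    frame = 0.5 ** np.arange(200)
    k = reflection_coefficients(frame, order=4)
    print(area_function(k))
```

Only area ratios are determined by the coefficients, so the result is a relative area function scaled by the assumed `lips_area`. This 1-D representation is what the outline describes as sometimes inaccurate for complicated shapes.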
Free Research Field |
Sound information processing
|
Academic Significance and Societal Importance of the Research Achievements |
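Studies exist that estimate articulatory parameters from speech signals, that generate lip images, and that synthesize speech from 3-D vocal tract shapes. However, research that estimates the 3-D vocal tract shape from a speech signal and then uses it to synthesize high-quality speech is unparalleled worldwide and highly original. Although this project did not succeed in estimating detailed 3-D vocal tract shapes, we consider it highly significant because it can lead to applications such as visualizing articulation in language education and to new speech interpolation methods for applications such as voice conversion.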
|