Development of high-quality speech analysis-synthesis systems with the ability to extract 3D vocal tract shape and vocal cord vibration signal precisely
Project/Area Number |
17K00253
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | Meijo University |
Principal Investigator |
Banno Hideki Meijo University, Faculty of Science and Technology, Associate Professor (20335003)
|
Project Period (FY) |
2017-04-01 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2020: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2019: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2018: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2017: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
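Keywords | 3-D vocal tract shape / vocal tract area function / PARCOR analysis / formant / FDTD method / machine learning / Kelly's vocal tract model / glottal source / PARCOR coefficients / source-filter model / speech analysis-synthesis / speech information processing |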
Outline of Final Research Achievements |
One way to estimate vocal tract shape from a speech signal is the PARCOR analysis-based method, which converts the PARCOR coefficients of the speech signal into a vocal tract area function. However, the estimated area function is sometimes incorrect and cannot always represent complicated shapes. We therefore began studying a method for precisely estimating 3-D vocal tract shape from a speech signal. First, physical 1-D vocal tract models corresponding to the estimated vocal tract area functions were created by 3-D printing, and their acoustic characteristics were measured. Second, the measured characteristics were compared with results from acoustic simulation methods such as the FDTD method, which computes acoustic characteristics from shape information. Finally, we improved our method based on these comparisons.
|
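As an illustration of the PARCOR-based conversion mentioned in the outline, the following is a minimal sketch, not the project's own implementation. It assumes the autocorrelation method with a Levinson-Durbin recursion, a chain of equal-length lossless tube sections ordered from the lips toward the glottis, and the area ratio A[i+1] = A[i] (1 - k[i]) / (1 + k[i]). The sign convention, the tube ordering, and the function names `parcor_coefficients` and `area_function` are assumptions made for this example; other texts use the reciprocal ratio.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame x."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def parcor_coefficients(x, order):
    """Reflection (PARCOR) coefficients k[0..order-1] by Levinson-Durbin.

    Assumed convention: prediction-error filter A(z) = 1 + sum_j a[j] z^-j.
    """
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy, shrinks at every step
    k = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k_i = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k_i * prev[i - j]
        a[i] = k_i
        err *= 1.0 - k_i * k_i
        k.append(k_i)
    return k


def area_function(k, lip_area=1.0):
    """Relative tube areas from PARCOR coefficients (assumed lips-first order)."""
    areas = [lip_area]
    for k_i in k:
        # Assumed sign convention; some references use (1 + k) / (1 - k).
        areas.append(areas[-1] * (1.0 - k_i) / (1.0 + k_i))
    return areas


if __name__ == "__main__":
    # Synthetic decaying frame standing in for a windowed speech frame.
    frame = [0.5 ** n for n in range(50)]
    print(area_function(parcor_coefficients(frame, 4)))
```

Only area ratios are determined this way, so `lip_area` fixes an arbitrary scale. The result is a 1-D area function, which is why it cannot by itself describe the complicated 3-D shapes discussed above.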
Academic Significance and Societal Importance of the Research Achievements |
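Studies exist on estimating articulatory parameters from speech signals, on generating lip images, and on synthesizing speech from 3-D vocal tract shapes. However, research that estimates the 3-D vocal tract shape from a speech signal and then uses it to synthesize high-quality speech is unparalleled worldwide and is highly original. Although this project did not achieve the part that estimates a detailed 3-D vocal tract shape, we consider the research highly significant because it can lead to applications such as visualizing articulation in language education and to the development of new speech interpolation methods for applications such as voice conversion.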
|
Report
(5 results)
Research Products
(8 results)