Project/Area Number |
20K11956
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 61030: Intelligent informatics-related
|
Research Institution | The University of Aizu |
Principal Investigator |
Villegas Julian The University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50706281)
|
Co-Investigator(Kenkyū-buntansha) |
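李 勝勲 International Christian University, College of Liberal Arts, Senior Associate Professor (20770134)
MARKOV K The University of Aizu, School of Computer Science and Engineering, Professor (80394998)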
|
Project Period (FY) |
2020-04-01 – 2023-03-31
|
Project Status |
Completed (Fiscal Year 2022)
|
Budget Amount |
¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000)
Fiscal Year 2022: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2021: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2020: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
|
Keywords | Phonation prediction / Machine Learning / Psychoacoustics / Corpus acquisition |
Outline of Research at the Start |
Many languages use voice qualities such as modal, breathy, and creaky to distinguish between units of sound. This project (PSYPHON) aims to create predictors for those voice qualities by using models of sound perception instead of analyzing the speech recordings directly.
|
Outline of Final Research Achievements |
We documented and acquired new corpora of words from the Zapotec and Mixe languages, which are characterized by a three-way phonemic contrast in phonation (modal, creaky, and breathy). These corpora are valuable for training phonation prediction systems based on machine learning. Through subjective experiments, we found that experts, and machine learning systems based on their classifications, were more sensitive to creakiness than native and naive listeners. This finding supports our hypothesis that psychoacoustic features, which are universal, are better predictors of perceived phonation than existing methods. In addition, we found that falsetto was associated with pitch, whispering with sharpness, and creakiness with loudness and roughness. Lastly, by re-analyzing previous subjective studies, we developed a psychoacoustic roughness model based on machine learning techniques.
|
Academic Significance and Societal Importance of the Research Achievements |
This project is significant because it helps to close the gap between under-resourced languages and those that have sufficient resources, contributing directly to Sustainable Development Goal 10 (SDG 10: reducing inequality within and among countries).
|