Cross-disciplinary approach to prosody-based automatic speech processing and its application to computer-assisted language teaching
Project/Area Number |
20K00838
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 02100: Foreign language education-related
|
Research Institution | The University of Aizu |
Principal Investigator |
Pyshkin Evgeny The University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)
|
Co-Investigator(Kenkyū-buntansha) |
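Mozgovoy Maxim The University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
BLAKE John The University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (80635954)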
|
Project Period (FY) |
2020-04-01 – 2023-03-31
|
Project Status |
Completed (Fiscal Year 2022)
|
Budget Amount |
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2022: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2020: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
Keywords | CAPT / prosody / speech visualization / pitch estimation / multimodal feedback / multi-language CAPT / suprasegmentals / proximal development / CAPT personalization / pitch visualization / L2 education / mobile technology / speech processing / Audio-visual feedback / ASR / Language prosody / CALL / mobile |
Outline of Research at the Start |
2020: Redesigning the digital signal processing (DSP) core for platform independence of the components used in CAPT, ASR, and phonology research. 2021: Developing mobile apps on top of our DSP library, using a stack of modern mobile development technologies. 2022: Evaluation in classroom situations.
|
Outline of Final Research Achievements |
We completed a study on advancing CAPT systems through signal processing, speech recognition, and speech processing algorithms, and on customizing them via computer-aided prosody modeling and visualization instruments. We developed a digital signal processing core comprising pitch extraction, voice activity detection, pitch graph interpolation, and pitch estimation, the last of which is based on the dynamic time warping (DTW) algorithm. The current implementation supports transcription and phrasal intonation visualization, shown as model and user pitch curves and accompanied by multimodal feedback that includes DTW-based metrics, extended phonetic transcription, and audio and video output. This provides a foundation for further tailoring the feedback with evaluative, instructive, and actionable components. The system has been assessed for several languages representing different language groups, which lays the groundwork for a multilingual, personalizable CAPT environment.
|
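As a rough illustration of two of the DSP steps named in the outline above (pitch graph interpolation and a DTW-based comparison of model and user pitch curves), the following is a minimal sketch, not the project's implementation. It assumes that a pitch curve is a list of per-frame values in Hz, that unvoiced frames are marked with `None`, and that the local cost is the absolute pitch difference. The function names `interpolate_unvoiced` and `dtw_distance` are hypothetical.

```python
import math


def interpolate_unvoiced(pitch):
    """Fill unvoiced frames (None) by linear interpolation between voiced frames.

    Assumption: leading and trailing unvoiced frames copy the nearest voiced value.
    """
    filled = list(pitch)
    voiced = [i for i, p in enumerate(pitch) if p is not None]
    if not voiced:
        return filled  # nothing to interpolate from
    for i in range(voiced[0]):
        filled[i] = pitch[voiced[0]]
    for i in range(voiced[-1] + 1, len(pitch)):
        filled[i] = pitch[voiced[-1]]
    for a, b in zip(voiced, voiced[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = pitch[a] + t * (pitch[b] - pitch[a])
    return filled


def dtw_distance(model, user):
    """Classic dynamic time warping cost between two pitch curves.

    Assumption: local cost is |model[i] - user[j]|; allowed steps are
    match, insertion, and deletion, with no window constraint.
    """
    n, m = len(model), len(user)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(model[i - 1] - user[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical example curves (Hz); the user curve has one unvoiced frame.
model_curve = [120.0, 140.0, 160.0]
user_curve = interpolate_unvoiced([120.0, None, 160.0, 160.0])
score = dtw_distance(model_curve, user_curve)
```

A lower score means the user's contour follows the model's shape more closely, regardless of differences in tempo. A practical system would likely normalize pitch (for example to semitones relative to the speaker's mean) before comparison; that step is omitted here.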
Academic Significance and Societal Importance of the Research Achievements |
The project advances a prosody-based CAPT system that uses signal and speech processing algorithms to visualize speech and provide multimodal feedback to learners. Applying the approach to different language groups has a strong impact on improving the communication skills of language learners.
|
Report
(4 results)
Research Products
(14 results)
[Journal Article] Speech Processing for Language Learning: A Practical Approach to Computer-Assisted Pronunciation Teaching (2021)
Author(s)
N. Bogach, E. Boitsova, S. Chernonog, A. Lamtev, M. Lesnychaya, I. Lezhenin, A. Novopashenny, R. Svechnikov, D. Tsikach, K. Vasiliev, J. Blake, and E. Pyshkin
Journal Title
Electronics
Volume: 10, Article 235
Issue: 3
Pages: 1-22
DOI
Related Report
Peer Reviewed / Open Access / Int'l Joint Research