
2022 Fiscal Year Final Research Report

Cross-disciplinary approach to prosody-based automatic speech processing and its application to computer-assisted language teaching

Research Project

Project/Area Number 20K00838
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 02100: Foreign language education-related
Research Institution The University of Aizu

Principal Investigator

Pyshkin Evgeny  The University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)

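Co-Investigator (Kenkyū-buntansha) Mozgovoy Maxim  The University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
BLAKE John  The University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (80635954)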
Project Period (FY) 2020-04-01 – 2023-03-31
Keywords CAPT / prosody / speech visualization / pitch estimation / multimodal feedback
Outline of Final Research Achievements

We studied how computer-assisted pronunciation training (CAPT) systems can be advanced using signal processing, speech recognition, and speech processing algorithms, customized through computer-aided prosody modeling and visualization tools.
We developed a digital signal processing core comprising pitch extraction, voice activity detection, pitch graph interpolation, and pitch estimation, the last based on the dynamic time warping (DTW) algorithm.
The current implementation supports transcription and phrasal intonation visualization, displaying model and user pitch curves together with multimodal feedback that includes DTW-based metrics, extended phonetic transcription, and audio and video output. This provides a foundation for further tailoring the feedback with evaluative, instructive, and actionable components. The system has been assessed for several languages from different language groups, laying the groundwork for a multilingual, personalizable CAPT environment.
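
As a rough illustration of two of the processing steps named above (pitch graph interpolation and a DTW-based comparison of model and user pitch curves), a minimal Python sketch might look as follows. It is not the project's implementation: the function names `interpolate_gaps` and `dtw_distance`, the use of `None` for unvoiced frames, linear interpolation, and the absolute-difference local cost are all assumptions made for this example, since the report does not specify them.

```python
from math import inf


def interpolate_gaps(pitch):
    """Fill unvoiced frames (None) by linear interpolation.

    Assumption: the pitch contour is a list of floats (e.g. Hz) with None
    marking frames rejected by voice activity detection.
    """
    filled = list(pitch)
    voiced = [i for i, v in enumerate(pitch) if v is not None]
    if not voiced:
        return filled  # nothing to interpolate from
    # Hold the nearest voiced value at the leading and trailing edges.
    for i in range(voiced[0]):
        filled[i] = pitch[voiced[0]]
    for i in range(voiced[-1] + 1, len(pitch)):
        filled[i] = pitch[voiced[-1]]
    # Interpolate linearly between each pair of neighbouring voiced frames.
    for a, b in zip(voiced, voiced[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = pitch[a] + t * (pitch[b] - pitch[a])
    return filled


def dtw_distance(model, user):
    """Classic dynamic time warping cost between two pitch contours.

    Assumption: local cost is the absolute pitch difference; a real system
    would likely normalize pitch (e.g. per speaker) before comparison.
    """
    n, m = len(model), len(user)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(model[i - 1] - user[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical contours; the values are illustrative only.
    model_curve = interpolate_gaps([None, 100.0, None, None, 130.0, None])
    user_curve = interpolate_gaps([105.0, None, 125.0, 128.0])
    print(dtw_distance(model_curve, user_curve))
```

Because DTW aligns the two contours non-linearly in time, a learner who produces the same intonation shape at a slower or faster tempo is not penalized for the tempo difference alone, which is one plausible reason to base a pitch similarity metric on it.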

Free Research Field

Human-centric software

Academic Significance and Societal Importance of the Research Achievements

The project advances a prosody-based CAPT system that uses signal and speech processing algorithms for speech visualization and provides multimodal feedback to learners. Applying the approach to different language groups can help language learners improve their communication skills.


Published: 2024-01-30  
