
Cross-disciplinary approach to prosody-based automatic speech processing and its application to computer-assisted language teaching

Research Project

Project/Area Number 20K00838
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 02100: Foreign language education-related
Research Institution The University of Aizu

Principal Investigator

Pyshkin Evgeny  University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)

Co-Investigator (Kenkyū-buntansha) Mozgovoy Maxim  University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
BLAKE John  University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (80635954)
Project Period (FY) 2020-04-01 – 2023-03-31
Project Status Completed (Fiscal Year 2022)
Budget Amount
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2022: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2020: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Keywords CAPT / prosody / speech visualization / pitch estimation / multimodal feedback / multi-language CAPT / suprasegmentals / proximal development / CAPT personalization / pitch visualization / L2 education / mobile technology / speech processing / audio-visual feedback / ASR / language prosody / CALL / mobile
Outline of Research at the Start

2020: Redesigning the digital signal processing (DSP) core for platform independence of components used in CAPT, ASR, and phonology research.
2021: Developing the mobile apps using our DSP library on a stack of modern mobile development technologies.
2022: Evaluation in classroom situations.

Outline of Final Research Achievements

We completed a study on advancing CAPT systems through speech recognition and speech processing algorithms, customized via computer-aided prosody modeling and visualization instruments.
We developed a digital signal processing core comprising pitch extraction, voice activity detection, pitch graph interpolation, and pitch estimation, the latter based on the dynamic time warping (DTW) algorithm.
The current implementation supports transcription and phrasal intonation visualization, shown as model and user pitch curves accompanied by multimodal feedback that includes DTW-based metrics, extended phonetic transcription, and audio and video output, thus providing a foundation for further feedback tailoring with evaluative, instructive, and actionable components. The system has been assessed for several languages representing different language groups, creating good ground for a further multilingual setup of a personalizable CAPT environment.
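The DTW-based comparison of a learner's pitch curve against a model curve can be sketched as follows. This is a minimal illustration of the alignment technique named above, not the project's actual implementation; the function name and the sample contour values are assumptions.

```python
def dtw_distance(model, user):
    """Return the DTW alignment cost between two pitch contours (lists of Hz values)."""
    n, m = len(model), len(user)
    INF = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning model[:i] with user[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(model[i - 1] - user[j - 1])   # local distance between pitch samples
            cost[i][j] = d + min(cost[i - 1][j],      # skip a model sample
                                 cost[i][j - 1],      # skip a user sample
                                 cost[i - 1][j - 1])  # match samples
    return cost[n][m]

# Illustrative contours: a close imitation yields a small alignment cost,
# while identical contours yield zero.
model_curve = [120.0, 150.0, 180.0, 160.0, 130.0]
user_curve = [118.0, 149.0, 185.0, 158.0, 131.0]
print(dtw_distance(model_curve, user_curve))
```

Because DTW warps the time axis, a learner who reproduces the intonation contour at a slightly different speaking rate is still scored by curve shape rather than by strict sample-by-sample timing, which is what makes it suitable as a feedback metric here.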

Academic Significance and Societal Importance of the Research Achievements

The project advances a prosody-based CAPT system that uses signal and speech processing algorithms for speech visualization and provides multimodal feedback to learners. Applying the approach to different language groups has a strong impact on improving the communication skills of language learners.

Report

(4 results)
  • 2022 Annual Research Report   Final Research Report ( PDF )
  • 2021 Research-status Report
  • 2020 Research-status Report
  • Research Products

    (14 results)


All: Int'l Joint Research (3 results), Journal Article (4 results) (of which Int'l Joint Research: 4, Peer Reviewed: 4, Open Access: 4), Presentation (5 results) (of which Int'l Joint Research: 5, Invited: 1), Remarks (2 results)

  • [Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)

    • Related Report
      2022 Annual Research Report
  • [Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)

    • Related Report
      2021 Research-status Report
  • [Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)

    • Related Report
      2020 Research-status Report
  • [Journal Article] Language accent detection with CNN using sparse data from a crowd-sourced speech archive (2022)

    • Author(s)
      V. Mikhailava, M. Lesnichaia, N. Bogach, I. Lezhenin, J. Blake, and E. Pyshkin
    • Journal Title

      Mathematics

      Volume: 10 Issue: 16 Pages: 2913-2913

    • DOI

      10.3390/math10162913

    • Related Report
      2022 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Adopting StudyIntonation CAPT Tools to Tonal Languages Through the Example of Vietnamese (2021)

    • Author(s)
      N. Nguyen Van, S. Luu Xuan, I. Lezhenin, N. Bogach, and E. Pyshkin
    • Journal Title

      SHS Web Conf.

      Volume: 102 Pages: 01007-01007

    • DOI

      10.1051/shsconf/202110201007

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Speech Processing for Language Learning: A Practical Approach to Computer-Assisted Pronunciation Teaching (2021)

    • Author(s)
      N. Bogach, E. Boitsova, S. Chernonog, A. Lamtev, M. Lesnychaya, I. Lezhenin, A. Novopashenny, R. Svechnikov, D. Tsikach, K. Vasiliev, J. Blake, and E. Pyshkin
    • Journal Title

      Electronics

      Volume: 10 (3), 235 Issue: 3 Pages: 1-22

    • DOI

      10.3390/electronics10030235

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] A Metaphoric Bridge: Understanding Software Engineering Education through Literature and Fine Arts (2020)

    • Author(s)
      E. Pyshkin and J. Blake
    • Journal Title

      Society. Communication. Education

      Volume: 11 (3) Pages: 59-77

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Dynamic assessment during suprasegmental training with mobile CAPT (2022)

    • Author(s)
      V. Mikhailava, J. Blake, E. Pyshkin, N. Bogach, S. Chernonog, A. Zhuikov, M. Lesnichaya, I. Lezhenin, and R. Svechnikov
    • Organizer
      11th International Conference on Speech Prosody 2022
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Classification of accented English using CNN model trained on amplitude mel-spectrograms (2022)

    • Author(s)
      M. Lesnichaia, V. Mikhailova, N. Bogach, I. Lezhenin, J. Blake, and E. Pyshkin
    • Organizer
      Interspeech 2022
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Increasing inclusivity: Catering to the needs of socially inactive learners (2022)

    • Author(s)
      E. Pyshkin and J. Blake
    • Organizer
      Diversity and Inclusivity in English Language Education 2023
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Tailoring Computer-Assisted Pronunciation Teaching: Mixing and Matching the Mode and Manner of Feedback to Learners (2022)

    • Author(s)
      V. Mikhailava, E. Pyshkin, J. Blake, S. Chernonog, I. Lezhenin, R. Svechnikov, and N. Bogach
    • Organizer
      INTED-2022
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
  • [Presentation] “Tailored Fit”: Shaping CAPT Tools Feedback to Language Learners (2021)

    • Author(s)
      E. Pyshkin
    • Organizer
      ICSEB-2021
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research / Invited
  • [Remarks] English Intonation Training

    • URL

      http://studyintonation.org/

    • Related Report
      2021 Research-status Report
  • [Remarks] Study Intonation: English Intonation Training

    • URL

      http://studyintonation.org/

    • Related Report
      2020 Research-status Report


Published: 2020-04-28   Modified: 2024-01-30  


Powered by NII kakenhi