
Cross-disciplinary approach to prosody-based automatic speech processing and its application to computer-assisted language teaching

Research Project

Project/Area Number 20K00838
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 02100: Foreign language education-related
Research Institution The University of Aizu

Principal Investigator

Pyshkin Evgeny  University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)

Co-Investigator (Kenkyū-buntansha) Mozgovoy Maxim  University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
BLAKE John  University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (80635954)
Project Period (FY) 2020-04-01 – 2023-03-31
Project Status Completed (Fiscal Year 2022)
Budget Amount
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2022: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2020: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Keywords CAPT / prosody / speech visualization / pitch estimation / multimodal feedback / multi-language CAPT / suprasegmentals / proximal development / CAPT personalization / pitch visualization / L2 education / mobile technology / speech processing / audio-visual feedback / ASR / language prosody / CALL / mobile
Outline of Research at the Start

2020: Redesigning the digital signal processing (DSP) core for platform independence of components used in CAPT, ASR, and phonology research.
2021: Developing the mobile apps using our DSP library on a stack of modern mobile development technologies.
2022: Evaluation in classroom situations.

Outline of Final Research Achievements

We completed a study on advancing CAPT systems through speech recognition and speech processing algorithms, customized via computer-aided prosody modeling and visualization instruments.
We developed a digital signal processing core comprising pitch extraction, voice activity detection, pitch graph interpolation, and pitch estimation, the latter based on the dynamic time warping (DTW) algorithm.
The current implementation supports transcription and phrasal intonation visualization, shown as model and user pitch curves accompanied by multimodal feedback that includes DTW-based metrics, extended phonetic transcription, and audio and video output, thus providing a foundation for further feedback tailoring with evaluative, instructive, and actionable components. The system has been assessed for several languages representing different language groups, creating good ground for a further multilingual setup of a personalizable CAPT environment.
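The DTW-based comparison of a learner's pitch curve against a model curve can be sketched as follows. This is a minimal illustration of the alignment technique named above, not the project's actual implementation; the function name and the sample contour values are assumptions.

```python
def dtw_distance(model, user):
    """Return the DTW alignment cost between two pitch contours (lists of Hz values)."""
    n, m = len(model), len(user)
    INF = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning model[:i] with user[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(model[i - 1] - user[j - 1])   # local distance between pitch samples
            cost[i][j] = d + min(cost[i - 1][j],      # skip a model sample
                                 cost[i][j - 1],      # skip a user sample
                                 cost[i - 1][j - 1])  # match samples
    return cost[n][m]

# Illustrative contours: a close imitation yields a small alignment cost,
# while identical contours yield zero.
model_curve = [120.0, 150.0, 180.0, 160.0, 130.0]
user_curve = [118.0, 149.0, 185.0, 158.0, 131.0]
print(dtw_distance(model_curve, user_curve))
```

Because DTW warps the time axis, a learner who reproduces the intonation contour at a slightly different speaking rate is still scored by curve shape rather than by strict sample-by-sample timing, which is what makes it suitable as a feedback metric here.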

Academic Significance and Societal Importance of the Research Achievements

The project advances a prosody-based CAPT system that uses signal and speech processing algorithms for speech visualization and provides multimodal feedback to learners. Applying the approach to different language groups has a strong impact on improving the communication skills of language learners.

Report

(4 results)
  • 2022 Annual Research Report   Final Research Report ( PDF )
  • 2021 Research-status Report
  • 2020 Research-status Report
  • Research Products

    (14 results)


All: Int'l Joint Research (3 results), Journal Article (4 results) (of which Int'l Joint Research: 4, Peer Reviewed: 4, Open Access: 4), Presentation (5 results) (of which Int'l Joint Research: 5, Invited: 1), Remarks (2 results)

  • [Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)

    • Related Report
      2022 Annual Research Report
  • [Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)

    • Related Report
      2021 Research-status Report
  • [Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)

    • Related Report
      2020 Research-status Report
  • [Journal Article] Language accent detection with CNN using sparse data from a crowd-sourced speech archive (2022)

    • Author(s)
      V. Mikhailava, M. Lesnichaia, N. Bogach, I. Lezhenin, J. Blake, and E. Pyshkin
    • Journal Title

      Mathematics

      Volume: 10 Issue: 16 Pages: 2913-2913

    • DOI

      10.3390/math10162913

    • Related Report
      2022 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Adopting StudyIntonation CAPT Tools to Tonal Languages Through the Example of Vietnamese (2021)

    • Author(s)
      N. Nguyen Van, S. Luu Xuan, I. Lezhenin, N. Bogach, and E. Pyshkin
    • Journal Title

      SHS Web Conf.

      Volume: 102 Pages: 01007-01007

    • DOI

      10.1051/shsconf/202110201007

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Speech Processing for Language Learning: A Practical Approach to Computer-Assisted Pronunciation Teaching (2021)

    • Author(s)
      N. Bogach, E. Boitsova, S. Chernonog, A. Lamtev, M. Lesnychaya, I. Lezhenin, A. Novopashenny, R. Svechnikov, D. Tsikach, K. Vasiliev, J. Blake, and E. Pyshkin
    • Journal Title

      Electronics

      Volume: 10 (3), 235 Issue: 3 Pages: 1-22

    • DOI

      10.3390/electronics10030235

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] A Metaphoric Bridge: Understanding Software Engineering Education through Literature and Fine Arts (2020)

    • Author(s)
      E. Pyshkin and J. Blake
    • Journal Title

      Society. Communication. Education

      Volume: 11 (3) Pages: 59-77

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Dynamic assessment during suprasegmental training with mobile CAPT (2022)

    • Author(s)
      V. Mikhailava, J. Blake, E. Pyshkin, N. Bogach, S. Chernonog, A. Zhuikov, M. Lesnichaya, I. Lezhenin, and R. Svechnikov
    • Organizer
      11th International Conference on Speech Prosody 2022
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Classification of accented English using CNN model trained on amplitude mel-spectrograms (2022)

    • Author(s)
      M. Lesnichaia, V. Mikhailova, N. Bogach, I. Lezhenin, J. Blake, and E. Pyshkin
    • Organizer
      Interspeech 2022
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Increasing inclusivity: Catering to the needs of socially inactive learners (2022)

    • Author(s)
      E. Pyshkin and J. Blake
    • Organizer
      Diversity and Inclusivity in English Language Education 2023
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Tailoring Computer-Assisted Pronunciation Teaching: Mixing and Matching the Mode and Manner of Feedback to Learners (2022)

    • Author(s)
      V. Mikhailava, E. Pyshkin, J. Blake, S. Chernonog, I. Lezhenin, R. Svechnikov, and N. Bogach
    • Organizer
      INTED-2022
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
  • [Presentation] “Tailored Fit”: Shaping CAPT Tools Feedback to Language Learners (2021)

    • Author(s)
      E. Pyshkin
    • Organizer
      ICSEB-2021
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research / Invited
  • [Remarks] English Intonation Training

    • URL

      http://studyintonation.org/

    • Related Report
      2021 Research-status Report
  • [Remarks] Study Intonation: English Intonation Training

    • URL

      http://studyintonation.org/

    • Related Report
      2020 Research-status Report


Published: 2020-04-28   Modified: 2024-01-30  


Powered by NII kakenhi