2020 Fiscal Year Research-status Report

Cross-disciplinary approach to prosody-based automatic speech processing and its application to computer-assisted language teaching

Research Project

Project/Area Number	20K00838
Research Institution	The University of Aizu
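Principal Investigator	Pyshkin Evgeny, University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)
Co-Investigator(Kenkyū-buntansha)	Mozgovoy Maxim, University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776); BLAKE John, University of Aizu, School of Computer Science and Engineering, Associate Professor (80635954)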
Project Period (FY)	2020-04-01 – 2023-03-31
Keywords	Speech processing / CAPT / Audio-visual feedback / ASR / Language prosody
Outline of Annual Research Achievements	We completed a study on the potential of teaching pronunciation with speech processing algorithms, individualized through computer-aided prosody modeling and visualization tools. We applied voice activity detection and extended our StudyIntonation learning environment with automated speech recognition (ASR) algorithms. Given the phonemes together with their duration and energy, the rhythmic pattern can be retrieved. Transcription and phrasal rhythm are visualized alongside phrasal intonation, which is shown by pitch curves. We reorganised the CAPT courseware to represent each task as a hierarchical phonological structure that contains an intonation curve, a rhythmic pattern and an IPA transcription. We also started a project on adapting StudyIntonation to the particular case of tonal languages.
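As an illustration of how a rhythmic pattern might be retrieved from phonemes with their duration and energy, the following is a minimal sketch, not the StudyIntonation implementation. The `Phoneme` type, the small vowel inventory, the prominence measure (duration times energy) and the strong/weak threshold (mean prominence) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    symbol: str      # IPA symbol, e.g. as produced by an ASR front end
    duration: float  # seconds
    energy: float    # mean energy over the segment, arbitrary units


# Hypothetical, deliberately small vowel inventory for the illustration.
VOWELS = set("aeiouəæɑɒɔɛɪʊʌ")


def rhythmic_pattern(phonemes: List[Phoneme]) -> str:
    """Label each vowel nucleus as strong ("S") or weak ("w").

    Assumed heuristic: a nucleus is strong when its prominence
    (duration * energy) exceeds the mean prominence of all nuclei.
    """
    prominences = [
        p.duration * p.energy for p in phonemes if p.symbol[:1] in VOWELS
    ]
    if not prominences:
        return ""
    mean = sum(prominences) / len(prominences)
    return "".join("S" if x > mean else "w" for x in prominences)


# Made-up measurements for "banana" /bənænə/.
banana = [
    Phoneme("b", 0.05, 0.3),
    Phoneme("ə", 0.06, 0.4),
    Phoneme("n", 0.05, 0.3),
    Phoneme("æ", 0.15, 0.9),
    Phoneme("n", 0.05, 0.3),
    Phoneme("ə", 0.08, 0.5),
]
print(rhythmic_pattern(banana))  # wSw
```

A real system would likely work on syllables rather than bare vowels and normalize for speaking rate; the sketch only shows that duration and energy per phoneme are sufficient input for a coarse strong/weak pattern.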
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: Early design assessments demonstrate both the high potential of the StudyIntonation environment and the improvements required to create a convenient, intuitive and interactive CAPT environment. The usability of CAPT tools increases if they can display the features of natural connected speech such as elision, assimilation, deletion and juncture. At the word level, the following pronunciation aspects can be trained: stress positioning; effects of stressed and unstressed syllables, e.g. vowel reduction; and tone movement. At the phrasal level, learners might observe sentence accent placement, rhythmic pattern production, and phrasal intonation movements related to communicative functions. The practical purpose of the StudyIntonation project is twofold: first, to develop and assess a technology-driven language learning environment, including a course toolkit with the end-user mobile and web-based applications that we have developed; and second, to develop tools for speech annotation and semantic analysis based on intonation patterns and digital signal processing algorithms.
Strategy for Future Research Activity	During assessment, our digital signal processing core produced inaccuracies in the phonetic transcription of colloquial speech. To the best of our knowledge, these inaccuracies stem from the ASR model used (e.g. Librispeech), which is trained on audiobooks performed by professional actors. One problem commonly faced when implementing a CAPT system is how to establish a relevant and adequately tailored feedback mechanism. First and most important, we need feedback that lets both the teacher and the learner identify and evaluate segmental and suprasegmental errors. Second, we need feedback to evaluate current progress and to suggest steps for improvement within the system. Third, teachers are often interested in behavioral feedback from their students, including their interests, involvement or engagement. Finally, there are also usability aspects. Although StudyIntonation provides feedback in the form of visuals and some numeric scores, there are still open issues in our design, such as (1) metric adequacy and sensitivity to phonemic, rhythmic and intonational distortions; (2) feedback limitations when learners are not verbally instructed on what to do to improve; (3) a rigid interface in which the graphs are not interactive; and (4) the effect of context, which produces multiple prosodic portraits of the same phrase that are difficult to display simultaneously.
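To make the question of metric sensitivity to intonational distortions concrete, the sketch below compares two pitch contours with dynamic time warping (DTW). DTW is chosen here purely as an assumption for illustration; this report does not state which metric StudyIntonation uses. The contour values are made up, and `dtw_distance` is a hypothetical helper name.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also aligned with earlier b
                cost[i][j - 1],      # b[j-1] also aligned with earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Made-up pitch contours (e.g. in semitones relative to a speaker baseline).
model = [1, 2, 3]
slower_learner = [1, 2, 2, 3]   # same shape, stretched in time
shifted_learner = [2, 3, 4]     # same shape, shifted upwards

print(dtw_distance(model, slower_learner))   # 0.0
print(dtw_distance(model, shifted_learner))  # 2.0
```

The two results hint at the adequacy question raised above: a purely temporal stretch costs nothing under this metric, while a uniform pitch shift is penalized unless the contours are normalized first.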
Causes of Carryover	Due to COVID-19 restrictions, we could not spend the funds budgeted for travel and workshop organization; they therefore need to be carried over to the next fiscal year with the same usage plan as in 2020.

Research Products
(4 results)

All Int'l Joint Research (1 results) Journal Article (2 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 2 results, Open Access: 2 results) Remarks (1 results)

[Int'l Joint Research] St. Petersburg Polytechnic University (Russian Federation)
- Country Name
  RUSSIAN FEDERATION
- Counterpart Institution
  St. Petersburg Polytechnic University
[Journal Article] Speech Processing for Language Learning: A Practical Approach to Computer-Assisted Pronunciation Teaching (2021)
- Author(s)
  N. Bogach, E. Boitsova, S. Chernonog, A. Lamtev, M. Lesnychaya, I. Lezhenin, A. Novopashenny, R. Svechnikov, D. Tsikach, K. Vasiliev, J. Blake, and E. Pyshkin
- Journal Title
  Electronics
  Volume: 10 (3), 235 Pages: 1 - 22
- DOI
  10.3390/electronics10030235
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] A Metaphoric Bridge: Understanding Software Engineering Education through Literature and Fine Arts (2020)
- Author(s)
  E. Pyshkin and J. Blake
- Journal Title
  Society. Communication. Education
  Volume: 11 (3) Pages: 59 - 77
- DOI
  10.18721/JHSS.11305
- Peer Reviewed / Open Access / Int'l Joint Research
[Remarks] Study Intonation: English Intonation Training
- URL
  http://studyintonation.org/
