FY2020 Research Status Report

Cross-disciplinary approach to prosody-based automatic speech processing and its application to computer-assisted language teaching

Research Project

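Project/Area Number	20K00838
Research Institution	The University of Aizu
Principal Investigator	Pyshkin Evgeny, The University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)
Co-Investigators	Mozgovoy Maxim, The University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776); BLAKE John, The University of Aizu, School of Computer Science and Engineering, Associate Professor (80635954)
Project Period (FY)	2020-04-01 – 2023-03-31
Keywords	Speech processing / CAPT / Audio-visual feedback / ASR / Language prosody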
Summary of Research Achievements	We completed a study on the potential of pronunciation teaching with the use of speech processing algorithms and their individualization via computer-aided prosody modeling and visualization instruments. We applied voice activity detection and instrumented our StudyIntonation learning environment with automatic speech recognition (ASR) algorithms. Given the phonemes together with their durations and energies, the rhythmic pattern of a phrase can be retrieved. The transcription and phrasal rhythm are visualized alongside the phrasal intonation, which is shown as pitch curves. We reorganised the CAPT courseware so that each task is represented as a hierarchical phonological structure containing an intonation curve, a rhythmic pattern and an IPA transcription. We also started a project on adapting StudyIntonation to tonal languages.
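The hierarchical task representation described in the summary can be illustrated with a minimal Python sketch. The report does not give the actual data model or the formula used to retrieve the rhythmic pattern, so everything here is an assumption made for illustration: the class names `Phoneme`, `Word` and `Task`, the field names, the sample values, and the prominence measure (duration times energy, summed per word and normalised over the phrase). It is not the StudyIntonation implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phoneme:
    ipa: str         # IPA symbol for the phoneme
    duration: float  # duration in seconds
    energy: float    # mean signal energy (arbitrary units)


@dataclass
class Word:
    text: str
    phonemes: List[Phoneme]


@dataclass
class Task:
    """Hypothetical courseware task: phrase -> words -> phonemes, plus a pitch curve."""
    phrase: str
    words: List[Word]
    pitch_curve: List[float] = field(default_factory=list)  # F0 in Hz, one value per frame

    def transcription(self) -> str:
        # Join IPA symbols inside each word; separate words with spaces.
        return " ".join("".join(p.ipa for p in w.phonemes) for w in self.words)

    def rhythmic_pattern(self) -> List[float]:
        # Assumed prominence measure: sum of duration * energy per word,
        # normalised so that the values over the phrase sum to 1.
        weights = [sum(p.duration * p.energy for p in w.phonemes) for w in self.words]
        total = sum(weights)
        return [x / total for x in weights] if total > 0 else [0.0] * len(weights)


# Illustrative values only; they are not measurements from the project.
task = Task(
    phrase="hi there",
    words=[
        Word("hi", [Phoneme("h", 0.05, 0.2), Phoneme("aɪ", 0.20, 0.8)]),
        Word("there", [Phoneme("ð", 0.05, 0.3), Phoneme("eə", 0.15, 0.6)]),
    ],
    pitch_curve=[180.0, 200.0, 170.0, 150.0],
)
print(task.transcription())
print(task.rhythmic_pattern())
```

Keeping the pitch curve at phrase level and the duration and energy values at phoneme level mirrors the three layers named in the summary (intonation curve, rhythmic pattern, IPA transcription); a real system would likely also store time alignments for each phoneme.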
Current Progress (Category)	2: Progressing mostly as planned
Reason	Early design assessments demonstrate both the high potential of the StudyIntonation environment and the improvements required to create a convenient, intuitive and interactive CAPT environment. The usability of CAPT tools increases if they can display features of natural connected speech such as elision, assimilation, deletion and juncture. At the word level, the following pronunciation aspects can be trained: stress positioning; effects of stressed and unstressed syllables, e.g. vowel reduction; and tone movement. Correspondingly, at the phrasal level, learners can observe sentence accent placement, rhythmic pattern production, and phrasal intonation movements related to communicative functions. The practical purpose of the StudyIntonation project is twofold: first, to develop and assess a technology-driven language learning environment, including a course toolkit with the end-user mobile and web-based applications that we developed; and second, to develop tools for speech annotation and semantic analysis based on intonation patterns and digital signal processing algorithms.
Plan for Future Research	During assessment, our digital signal processing core produced inaccuracies in the phonetic transcription of colloquial speech. To the best of our knowledge, these inaccuracies stem from the training data of the ASR model used (e.g. LibriSpeech), which consists of audiobooks read aloud rather than colloquial speech. One problem commonly faced when implementing a CAPT system is how to establish a relevant, adequate and tailored feedback mechanism. First and most important, feedback is needed so that both the teacher and the learner can identify and evaluate segmental and suprasegmental errors. Second, feedback is needed to evaluate current progress and to suggest steps for improvement. Third, teachers are often interested in behavioral feedback from their students, including their interests, involvement and engagement. Finally, there are usability aspects. Although StudyIntonation provides feedback in the form of visuals and some numeric scores, open issues remain in our design: (1) the adequacy of the metrics and their sensitivity to phonemic, rhythmic and intonational distortions; (2) feedback limitations when learners are not verbally instructed on what to do to improve; (3) a rigid interface in which the graphs are not interactive; and (4) the effect of context, which produces multiple prosodic portraits of the same phrase that are difficult to display simultaneously.
Reason for Carryover to the Next Fiscal Year	Due to COVID-19 restrictions, we could not spend the funds budgeted for travel and workshop organization. They are therefore carried over to the next fiscal year with the same usage plan as in 2020.

Research Outputs
(4 items)

All: International Joint Research (1), Journal Articles (2) (of which 2 internationally co-authored, 2 peer-reviewed, 2 open access), Remarks (1)

[International Joint Research] St. Petersburg Polytechnic University (Russian Federation)
- Country
  Russian Federation
- Foreign Institution
  St. Petersburg Polytechnic University
[Journal Article] Speech Processing for Language Learning: A Practical Approach to Computer-Assisted Pronunciation Teaching (2021)
- Authors
  N. Bogach, E. Boitsova, S. Chernonog, A. Lamtev, M. Lesnychaya, I. Lezhenin, A. Novopashenny, R. Svechnikov, D. Tsikach, K. Vasiliev, J. Blake, and E. Pyshkin
- Journal
  Electronics
  Volume: 10 (3), 235 Pages: 1-22
- DOI
  10.3390/electronics10030235
- Peer-reviewed / Open access / Internationally co-authored
[Journal Article] A Metaphoric Bridge: Understanding Software Engineering Education through Literature and Fine Arts (2020)
- Authors
  E. Pyshkin and J. Blake
- Journal
  Society. Communication. Education
  Volume: 11 (3) Pages: 59-77
- DOI
  10.18721/JHSS.11305
- Peer-reviewed / Open access / Internationally co-authored
[Remarks] Study Intonation: English Intonation Training
- URL
  http://studyintonation.org/
