Project/Area Number |
20K00838
|
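Research Institution | University of Aizu |
Principal Investigator |
Pyshkin Evgeny University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (50794088)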
|
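Co-Investigator |
Mozgovoy Maxim University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
BLAKE John University of Aizu, School of Computer Science and Engineering, Associate Professor (80635954)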
|
Project Period (FY) |
2020-04-01 – 2023-03-31
|
Keywords | Speech processing / CAPT / Audio-visual feedback / ASR / Language prosody |
Outline of Research Achievements |
We completed a study on the potential of pronunciation teaching that uses speech processing algorithms, and on its individualization via computer-aided prosody modeling and visualization instruments. We applied voice activity detection and equipped our StudyIntonation learning environment with automated speech recognition algorithms. Given the phonemes together with their durations and energy, the rhythmic pattern of a phrase can be retrieved. Transcription and phrasal rhythm are visualized alongside phrasal intonation, which is shown by pitch curves. We reorganised the CAPT courseware to represent each task as a hierarchical phonological structure containing an intonation curve, a rhythmic pattern and an IPA transcription. We also started a project on adapting StudyIntonation to the particular case of tonal languages.
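The task representation described above can be illustrated with a minimal Python sketch. It is not the StudyIntonation implementation: the class names `Phoneme` and `Task`, their fields, and the choice of duration-weighted energy as the rhythm measure are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phoneme:
    """One recognised phoneme (hypothetical structure)."""
    ipa: str          # IPA symbol
    duration: float   # length in seconds
    energy: float     # mean energy, arbitrary units


@dataclass
class Task:
    """A courseware task holding transcription, rhythm and intonation data."""
    text: str
    phonemes: List[Phoneme]
    pitch_curve: List[float] = field(default_factory=list)  # pitch per frame, Hz

    def transcription(self) -> str:
        """IPA transcription assembled from the phoneme sequence."""
        return "".join(p.ipa for p in self.phonemes)

    def rhythmic_pattern(self) -> List[float]:
        """Relative prominence of each phoneme.

        Assumption: prominence is duration multiplied by energy,
        normalised so that the values sum to 1.
        """
        weights = [p.duration * p.energy for p in self.phonemes]
        total = sum(weights)
        if total == 0:
            return [0.0 for _ in weights]
        return [w / total for w in weights]
```

For example, a task whose three phonemes have equal energy and durations in the ratio 1:1:2 yields a rhythmic pattern of roughly 0.25, 0.25 and 0.5, which a front end could render as bars of proportional width next to the pitch curve.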
|
Current Status of Research Progress (Category) |
2: Progressing generally smoothly
Reason
Early design assessments demonstrate both the high potential of the StudyIntonation environment and the improvements required to create a convenient, intuitive and interactive CAPT environment. The usability of CAPT tools increases if they are able to display the features of natural connected speech, such as elision, assimilation, deletion and juncture. At word level, the following pronunciation aspects can be trained: stress positioning; the effects of stressed and unstressed syllables, e.g. vowel reduction; and tone movement. Correspondingly, at phrasal level learners can observe sentence accent placement, rhythmic pattern production, and phrasal intonation movements related to communicative functions. The practical purpose of the StudyIntonation project is twofold: first, to develop and assess a technology-driven language learning environment, including a course toolkit with the end-user mobile and web-based applications that we have developed; and second, to develop tools for speech annotation and semantic analysis based on intonation patterns and digital signal processing algorithms.
|
Strategy for Future Research Activity |
During assessment, our digital signal processing core produced inaccuracies in the phonetic transcription of colloquial speech. To the best of our knowledge, these inaccuracies stem from the ASR model used (e.g. one trained on LibriSpeech), which is trained on carefully read audiobook recordings rather than spontaneous speech. One problem commonly faced when implementing a CAPT system is how to establish a relevant, adequate and tailored feedback mechanism. First and most important, we need feedback that lets both the teacher and the learner identify and evaluate segmental and suprasegmental errors. Second, we need feedback to evaluate current progress and to suggest steps for improvement within the system. Third, teachers are often interested in behavioral feedback from their students, including their interests, involvement and engagement. Finally, there are usability aspects. Although StudyIntonation provides feedback in the form of visuals and some numeric scores, open issues remain in our design: (1) the adequacy of the metrics and their sensitivity to phonemic, rhythmic and intonational distortions; (2) limited feedback, since learners are not verbally instructed what to do to improve; (3) a rigid interface, since the graphs are not interactive; and (4) the effect of context, which produces multiple prosodic portraits of the same phrase that are difficult to display simultaneously.
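To make issue (1) concrete, the sketch below shows one common way a numeric score between a model pitch curve and a learner pitch curve could be computed: dynamic time warping (DTW). The report does not state which metric StudyIntonation uses, so the choice of DTW, the function name `dtw_distance` and the absolute-difference local cost are assumptions for illustration.

```python
from typing import Sequence


def dtw_distance(model: Sequence[float], learner: Sequence[float]) -> float:
    """Dynamic time warping distance between two pitch curves.

    Assumption: the local cost is the absolute difference between
    pitch values; a real system would probably normalise pitch first
    (e.g. to semitones relative to each speaker's mean).
    """
    n, m = len(model), len(learner)
    if n == 0 or m == 0:
        raise ValueError("both curves must be non-empty")

    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning model[:i] with learner[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(model[i - 1] - learner[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # model frame repeated
                cost[i][j - 1],      # learner frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A score of this kind tolerates differences in tempo, because a slower rendition of the same contour still gives a distance of zero, but it says nothing on its own about where the learner deviated or how to correct it. That limitation is what issues (1) and (2) describe.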
|
Reason for Carryover to the Next Fiscal Year |
Due to COVID-19 restrictions, we could not spend the funds budgeted for travel and workshop organization. They are therefore carried over to the next fiscal year with the same usage plan as in 2020.
|