Computer-Assisted Pronunciation Learning System using Speech Recognition Techniques

Research Project

Project/Area Number	11558037
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	Development Research (展開研究)
Research Field	Intelligent informatics
Research Institution	KYOTO UNIVERSITY
Principal Investigator	KAWAHARA Tatsuya, Kyoto University, Graduate School of Informatics, Associate Professor (00234104)
Co-Investigator (Kenkyū-buntansha)	KATAGIRI Shigeru, NTT Communication Science Laboratories, Executive Manager; DOSHITA Shuji, Ryukoku University, Faculty of Science and Technology, Professor (00025925); DANTSUJI Masatake, Kyoto University, Center for Information and Multimedia Studies, Professor (10188469); SHIMIZU Masaaki, Kyoto University, Center for Information and Multimedia Studies, Assistant Professor (10314262); OKUNO Hiroshi, Kyoto University, Graduate School of Informatics, Professor (60318201); NAKAGAWA Seiichi (中川聖一), Toyohashi University of Technology, Faculty of Engineering, Professor (20115893); IKEDA Katsuo (池田克夫), Kyoto University, Graduate School of Informatics, Professor (30026009)
Project Period (FY)	1999 – 2001
Project Status	Completed (Fiscal Year 2001)
Budget Amount	¥7,900,000 (Direct Cost: ¥7,900,000); Fiscal Year 2001: ¥1,600,000 (Direct Cost: ¥1,600,000); Fiscal Year 2000: ¥2,100,000 (Direct Cost: ¥2,100,000); Fiscal Year 1999: ¥4,200,000 (Direct Cost: ¥4,200,000)
Keywords	speech processing / language learning / CALL / speech recognition / phonology / prosody / rhythm / articulation
Research Abstract	A Computer-Assisted Language Learning (CALL) system focusing on pronunciation training is studied for Japanese students learning English. First, we model the typical English pronunciation errors of Japanese learners and design a system that detects pronunciation errors and generates effective instruction using speech recognition technologies. For a given training text, a network of error candidates is generated so that speech recognition can align the utterance and detect errors. A segment-input pair-wise classifier is then applied for verification. This method achieves reliable error detection and effective instruction based on articulatory information. Second, we develop a computer-assisted English prosody learning system. Learners' pronunciation is evaluated by automatic detection of syllables carrying sentence stress and of foot durations. Syllable HMMs are categorized based on error patterns of stress. We also propose a multi-stage discrimination method that reflects native speakers' perception. Furthermore, foot templates are constructed from a native speech database in order to evaluate stress timing. Finally, we study how to estimate non-native speakers' intelligibility and how to determine which pronunciation errors affect intelligibility the most. A preliminary study showed that error rates computed by a speech recognition-based system can be used to characterize intelligibility. We use the error rate distributions to assess a student's intelligibility and compute a priority function to find which areas of study are most likely to improve it.
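
The error-candidate network described in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: the substitution table below is a hypothetical example built from commonly cited confusions of Japanese learners (r/l, th to s, v to b), the phoneme labels are informal, and the function names are assumptions introduced here. A real system would score the network paths with acoustic models; this sketch only shows how the network is expanded and how a recognized path is compared with the canonical one.

```python
from itertools import product

# Hypothetical substitution rules (illustrative only, not the project's rule set):
# canonical phoneme -> error candidates a Japanese learner might produce.
SUBSTITUTIONS = {
    "r": ["l"],
    "l": ["r"],
    "th": ["s"],
    "v": ["b"],
}


def build_error_network(phonemes):
    """Return one slot per canonical phoneme; each slot lists the canonical
    phoneme first, followed by its error candidates."""
    return [[p] + SUBSTITUTIONS.get(p, []) for p in phonemes]


def enumerate_paths(network):
    """List every phoneme sequence the network allows (small inputs only)."""
    return [list(path) for path in product(*network)]


def detect_errors(canonical, recognized):
    """Compare the recognized path with the canonical sequence slot by slot
    and return (position, expected, produced) for each mismatch."""
    return [
        (i, c, r)
        for i, (c, r) in enumerate(zip(canonical, recognized))
        if c != r
    ]


canonical = ["r", "ay", "t"]            # informal labels for the word "right"
network = build_error_network(canonical)
paths = enumerate_paths(network)        # the canonical path plus one error path
errors = detect_errors(canonical, ["l", "ay", "t"])
```

Because each slot keeps the canonical phoneme alongside its candidates, the recognizer's chosen path is already aligned with the training text, so a detected mismatch points directly at the phoneme to be corrected.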

Report

(4 results)

2001 Annual Research Report
2001 Final Research Report Summary
2000 Annual Research Report
1999 Annual Research Report

Research Products
(35 results)

All Publications (35 results)

[Publications] C.-H.Jo: "Japanese pronunciation instruction system using speech recognition methods" IEICE Trans. E83-D,11. 1960-1968 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] Y.Tsubota: "Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures" Proc. Int'l Conf. Spoken Language Processing (ICSLP). 3. 566-569 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] K.Imoto: "Modelling of the perception of English sentence stress for computer-assisted language learning" Proc. Int'l Conf. Spoken Language Processing (ICSLP). 3. 175-178 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] A.Raux: "Optimizing computer-assisted pronunciation instruction by selecting relevant training topics" InSTIL 2002 Advanced Workshop. (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] Y.Tsubota: "CALL system for Japanese students of English using formant structure estimation and pronunciation error prediction" InSTIL 2002 Advanced Workshop. (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] 河原達也: "日本語ディクテーション基本ソフトウェア(99年度版)" 日本音響学会誌. 57,3. 210-214 (2001)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] 鹿野清宏: "音声認識システム" オーム社. 200 (2001)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] C.-H.Jo, T.Kawahara, S.Doshita, and M.Dantsuji: "Japanese pronunciation instruction system using speech recognition methods" IEICE Trans. Vol.E83-D, No.11. 1960-1968 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] A.Raux and T.Kawahara: "Optimizing computer-assisted pronunciation instruction by selecting relevant training topics" InSTIL 2002 Advanced Workshop. (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Y.Tsubota, T.Kawahara, and M.Dantsuji: "CALL system for Japanese students of English using formant structure estimation and pronunciation error prediction" InSTIL 2002 Advanced Workshop. (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Y.Tsubota, M.Dantsuji, and T.Kawahara: "Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures" Proc. ICSLP. Vol.3. 566-569 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] K.Imoto, M.Dantsuji, and T.Kawahara: "Modelling of the perception of English sentence stress for computer-assisted language learning" Proc. ICSLP. Vol.3. 175-178 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] C.-H.Jo, T.Kawahara, and S.Doshita: "The use of duration similarity templates in speech rhythm training" Proc. IEEE Region 10 Conference (TENCON). 146-149 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] C.-H.Jo, T.Kawahara, and S.Doshita: "Mora-timed speech rhythm training system using rhythm pattern templates" Proc. Int'l Conf. on Speech Processing. 129-134 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Y.Tsubota: "CALL system for Japanese students of English using formant structure estimation and pronunciation error prediction" InSTIL 2002 Advanced Workshop. (2002)
- Related Report
  2001 Annual Research Report
[Publications] A.Raux: "Optimizing computer-assisted pronunciation instruction by selecting relevant training topics" InSTIL 2002 Advanced Workshop. (2002)
- Related Report
  2001 Annual Research Report
[Publications] T.Kawahara: "Automatic transcription of spontaneous lecture speech" IEEE Workshop on Automatic Speech Recognition and Understanding. (2001)
- Related Report
  2001 Annual Research Report
[Publications] 井本和範: "文強勢と等時性の自動検出に基づく英語韻律学習支援システム" 電子情報通信学会技術研究報告. SP2001-133. (2002)
- Related Report
  2001 Annual Research Report
[Publications] A.Raux: "Intelligibility assessment and pronunciation error diagnosis for a CALL system" 電子情報通信学会技術研究報告. SP2001-134. (2002)
- Related Report
  2001 Annual Research Report
[Publications] 河原達也: "連続音声認識コンソーシアム2000年度版ソフトウェアの概要と評価" 情報処理学会研究報告. SLP-38-6. (2001)
- Related Report
  2001 Annual Research Report
[Publications] 鹿野清宏: "音声認識システム" オーム社. 200 (2001)
- Related Report
  2001 Annual Research Report
[Publications] C.-H.Jo: "Japanese pronunciation instruction system using speech recognition methods" IEICE Trans. E83-D,11. 1960-1968 (2000)
- Related Report
  2000 Annual Research Report
[Publications] 河原達也: "日本語ディクテーション基本ソフトウェア(99年度版)" 日本音響学会誌. 57,3. 210-214 (2001)
- Related Report
  2000 Annual Research Report
[Publications] 李晃伸: "Phonetic Tied-Mixtureモデルを用いた大語彙連続音声認識" 電子情報通信学会論文誌. J83-DII,12. 2517-2525 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Y.Tsubota: "Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures" Proc. Int'l Conf. Spoken Language Processing (ICSLP). 3. 566-569 (2000)
- Related Report
  2000 Annual Research Report
[Publications] K.Imoto: "Modelling of the perception of English sentence stress for computer-assisted language learning" Proc. Int'l Conf. Spoken Language Processing (ICSLP). 3. 175-178 (2000)
- Related Report
  2000 Annual Research Report
[Publications] 井本和範: "CALLシステムのための英語文強勢知覚のモデル化" 電子情報通信学会技術研究報告. SP2000-1. (2000)
- Related Report
  2000 Annual Research Report
[Publications] 鹿野清宏: "音声認識システム" オーム社. (2001)
- Related Report
  2000 Annual Research Report
[Publications] T.Kawahara: "Japanese dictation toolkit - 1997 version -" J. Acoust. Soc. Japan (E). 20,3. 233-239 (1999)
- Related Report
  1999 Annual Research Report
[Publications] 河原達也: "発話検証に基づく音声操作プロジェクタとそれによる講演の自動ハイパーテキスト化" 情報処理学会論文誌. 40,4. 1491-1498 (1999)
- Related Report
  1999 Annual Research Report
[Publications] 河原達也: "日本語ディクテーション基本ソフトウェア (98年度版)" 日本音響学会誌. 56,4. (2000)
- Related Report
  1999 Annual Research Report
[Publications] C.-H.Jo: "Mora-timed speech rhythm training system using rhythm pattern templates" Proc. Int'l Conf. on Speech Processing. 129-134 (1999)
- Related Report
  1999 Annual Research Report
[Publications] C.-H.Jo: "The use of duration similarity templates in speech rhythm training" Proc. IEEE Region 10 Conference. 146-149 (1999)
- Related Report
  1999 Annual Research Report
[Publications] 坪田康: "フォルマント構造推定による日本人用英語発音教示システム" 情報処理学会研究報告. SLP-27-12. (1999)
- Related Report
  1999 Annual Research Report
[Publications] 佐藤誠: "人工現実感の設計 ―究極のインターフェイスを求めて―"培風館. (2000)
- Related Report
  1999 Annual Research Report
