Flexible Spoken Language Processing for Automatic Transcription of Lectures and Meetings

Research Project

Project/Area Number	12480085
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	KYOTO UNIVERSITY
Principal Investigator	KAWAHARA Tatsuya Kyoto University, Graduate School of Informatics, Associate Professor (00234104)
Co-Investigator(Kenkyū-buntansha)	DOSHITA Shuji Ryukoku University, Faculty of Science and Technology, Professor (00025925); IKEDA Katsuo Osaka Institute of Technology, Department of Information Science, Professor (30026009); KUROHASHI Sadao The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (50263108); OKUNO Hiroshi Kyoto University, Graduate School of Informatics, Professor (60318201); SATO Satoshi Kyoto University, Graduate School of Informatics, Associate Professor (30205918)
Project Period (FY)	2000 – 2002
Project Status	Completed (Fiscal Year 2002)
Budget Amount	¥7,200,000 (Direct Cost: ¥7,200,000); Fiscal Year 2002: ¥1,500,000 (Direct Cost: ¥1,500,000); Fiscal Year 2001: ¥1,800,000 (Direct Cost: ¥1,800,000); Fiscal Year 2000: ¥3,900,000 (Direct Cost: ¥3,900,000)
Keywords	spoken language processing / speech recognition / spontaneous speech / acoustic model / language model / HMM / N-gram / speaker recognition
Research Abstract	This project addresses automatic transcription of lectures using the Corpus of Spontaneous Japanese, collected under a priority research project in Japan. First, we investigate the effect of speaking style and the amount of training data on acoustic modeling. Then, to supplement the training data for the language model, we incorporate other text corpora and optimize the mixture weights. We also implement a sequential decoding method that does not require prior segmentation of lecture recordings. Next, we investigate acoustic, pronunciation, and language modeling to improve accuracy, focusing on the following issues: (1) speaking-rate-dependent decoding and adaptation of the acoustic model; (2) statistical modeling of pronunciation variations and unsupervised adaptation of the language model. Furthermore, we study the following spoken language processing tasks: (3) automatic indexing of lecture audio by extracting topic-independent discourse markers; (4) automatic transformation of lecture transcriptions into document style using a statistical framework; (5) extraction of important sentences from lectures using statistics of discourse markers and topic words.
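
The abstract mentions combining text corpora for the language model "with optimization of mixture weights" but does not say how the weights were optimized. As an illustration only, the sketch below uses the standard EM procedure for linear interpolation of component language models on held-out data; this is an assumed technique, not necessarily the one used in the project. The function names and the toy probabilities are hypothetical.

```python
import math


def em_mixture_weights(component_probs, iterations=50):
    """Estimate interpolation weights by EM (assumed technique, see lead-in).

    component_probs[t][k] is the probability that component model k assigns
    to held-out token t given its history. All values must be positive.
    """
    num_models = len(component_probs[0])
    weights = [1.0 / num_models] * num_models  # start from uniform weights
    for _ in range(iterations):
        totals = [0.0] * num_models
        for probs in component_probs:
            # Probability of this token under the current mixture.
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: posterior responsibility of each component.
            for k in range(num_models):
                totals[k] += weights[k] * probs[k] / mix
        # M-step: new weight is the average responsibility.
        weights = [t / len(component_probs) for t in totals]
    return weights


def perplexity(component_probs, weights):
    """Perplexity of the interpolated model on the held-out tokens."""
    log_sum = 0.0
    for probs in component_probs:
        log_sum += math.log(sum(w * p for w, p in zip(weights, probs)))
    return math.exp(-log_sum / len(component_probs))


# Hypothetical per-token probabilities from two models, for example an
# in-domain lecture model (column 0) and a general text model (column 1).
held_out = [(0.20, 0.05), (0.10, 0.02), (0.01, 0.08), (0.15, 0.03)]
weights = em_mixture_weights(held_out)
uniform = [0.5, 0.5]
```

Because each EM iteration cannot decrease the held-out likelihood, the perplexity under the estimated weights is never worse than under the uniform starting weights.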

Report

(4 results)

2002 Annual Research Report Final Research Report Summary
2001 Annual Research Report
2000 Annual Research Report

Research Products
(39 results)

All Publications (39 results)

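[Publications] H.Nanjo: "Lecture speech recognition using a large-scale database of spontaneous Japanese"IEICE Transactions (Japanese Edition). J86-DII, 4. 450-459 (2003)
- Description
  From the Final Research Report Summary (Japanese)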
- Related Report
  2002 Final Research Report Summary
[Publications] M.Hasegawa: "Automatic indexing of lecture speech based on extraction of discourse markers"IPSJ Journal. 43,7. 2222-2229 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] H.Nanjo: "Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition"Proc. IEEE-ICASSP. 1. 725-728 (2002)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] T.Kawahara: "Automatic transcription of spontaneous lecture speech"IEEE workshop Automatic Speech Recognition and Understanding. (2001)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] T.Kawahara: "Japanese dictation toolkit (FY1999 version)"Journal of the Acoustical Society of Japan. 57,3. 210-214 (2001)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] T.Kawahara: "An overview of spontaneous speech recognition"IEICE Technical Report. SP2000-95. (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] K.Shikano: "Speech Recognition Systems"Ohmsha. 200 (2001)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2002 Final Research Report Summary
[Publications] T. Kawahara, H. Nanjo, T. Shinozaki, S. Furui: "Benchmark test for speech recognition using the Corpus of Spontaneous Japanese"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] H. Nanjo, K. Shitaoka, T. Kawahara: "Automatic transformation of lecture transcription into document style using statistical framework"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] H. Nanjo, T. Kawahara: "Unsupervised language model adaptation for lecture speech recognition"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] Y. Akita, M. Nishida, T. Kawahara: "Automatic transcription of discussions using unsupervised speaker indexing"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] T. Kawahara, M. Hasegawa: "Automatic indexing of lecture speech by extracting topic-independent discourse markers"Proc. IEEE-ICASSP. 1-4 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] H. Nanjo, T. Kawahara: "Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition"Proc. IEEE-ICASSP. 725-728 (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] T. Kawahara, H. Nanjo, S. Furui: "Automatic transcription of spontaneous lecture speech"Proc. IEEE workshop on Automatic Speech Recognition and Understanding. (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] H. Nanjo, K. Kato, T. Kawahara: "Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition"Proc. EUROSPEECH. 2531-2534 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] A. Lee, T. Kawahara, K. Shikano: "Julius -- an open source real-time large vocabulary recognition engine"Proc. EUROSPEECH. 1691-1694 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] A. Lee, T. Kawahara, K. Shikano: "Gaussian mixture selection using context-independent HMM"Proc. IEEE-ICASSP. 69-72 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] K. Kato, H. Nanjo, T. Kawahara: "Automatic transcription of lecture speech using topic-independent language modeling"Proc. ICSLP. Vol. 1. 162-165 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] A. Lee, T. Kawahara, K. Takeda, K. Shikano: "A new phonetic tied-mixture model for efficient decoding"Proc. IEEE-ICASSP. 1269-1272 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2002 Final Research Report Summary
[Publications] T.Kawahara: "Automatic indexing of lecture speech by extracting topic-independent discourse markers"Proc. IEEE-ICASSP. 1. 1-4 (2002)
- Related Report
  2002 Annual Research Report
[Publications] T.Kawahara: "Overview of the Continuous Speech Recognition Consortium FY2001 software"IPSJ SIG Technical Report. SLP-43-3. (2002)
- Related Report
  2002 Annual Research Report
[Publications] H.Nanjo: "Lecture speech recognition using a large-scale database of spontaneous Japanese"IEICE Transactions (Japanese Edition). J86-DII,4. (2003)
- Related Report
  2002 Annual Research Report
[Publications] M.Hasegawa: "Automatic indexing of lecture speech based on extraction of discourse markers"IPSJ Journal. 43,7. 2222-2229 (2002)
- Related Report
  2002 Annual Research Report
[Publications] A.Lee: "Reducing acoustic likelihood computation by Gaussian mixture selection using context-independent HMMs"IPSJ Journal. 43,7. 2214-2221 (2002)
- Related Report
  2002 Annual Research Report
[Publications] H.Nanjo: "Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition"Proc. IEEE-ICASSP. 1. 725-728 (2002)
- Related Report
  2002 Annual Research Report
[Publications] T.Kawahara: "Automatic transcription of spontaneous lecture speech"IEEE workshop Automatic Speech Recognition and Understanding. (2001)
- Related Report
  2001 Annual Research Report
[Publications] T.Kawahara: "Overview and evaluation of the Continuous Speech Recognition Consortium FY2000 software"IPSJ SIG Technical Report. SLP-38-6. (2001)
- Related Report
  2001 Annual Research Report
[Publications] T.Kawahara: "Improving the language model and decoder for spontaneous speech recognition"IPSJ SIG Technical Report. SLP-36-3. (2001)
- Related Report
  2001 Annual Research Report
[Publications] M.Mimura: "Difference of acoustic modeling for read speech and dialogue speech"Acoustical Science & Technology. 22. 373-374 (2001)
- Related Report
  2001 Annual Research Report
[Publications] H.Nanjo: "Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition"Proc.EUROSPEECH. 2531-2534 (2001)
- Related Report
  2001 Annual Research Report
[Publications] A.Lee: "Gaussian mixture selection using context-independent HMM"Proc.IEEE-ICASSP. 1. 69-72 (2001)
- Related Report
  2001 Annual Research Report
[Publications] K.Shikano: "Speech Recognition Systems"Ohmsha. 200 (2001)
- Related Report
  2001 Annual Research Report
[Publications] T.Kawahara: "Japanese dictation toolkit (FY1999 version)"Journal of the Acoustical Society of Japan. 57,3. 210-214 (2001)
- Related Report
  2000 Annual Research Report
[Publications] A.Lee: "Large vocabulary continuous speech recognition using the phonetic tied-mixture model"IEICE Transactions (Japanese Edition). J83-DII,12. 2517-2525 (2000)
- Related Report
  2000 Annual Research Report
[Publications] K.Komatani: "Flexible mixed-initiative dialogue management using concept-level confidence measures of speech recognizer output"Proc.Int'l Conf.Computational Linguistics(COLING). 467-473 (2000)
- Related Report
  2000 Annual Research Report
[Publications] K.Kato: "Automatic Transcription of Lecture Speech using Topic-Independent Language Modeling"Proc.Int'l Conf.Spoken Language Processing(ICSLP). 1. 162-165 (2000)
- Related Report
  2000 Annual Research Report
[Publications] T.Kawahara: "An overview of spontaneous speech recognition"IEICE Technical Report. SP2000-95. (2000)
- Related Report
  2000 Annual Research Report
[Publications] K.Kato: "A study of acoustic and language models for lecture speech recognition"IEICE Technical Report. SP2000-97. (2000)
- Related Report
  2000 Annual Research Report
[Publications] K.Shikano: "Speech Recognition Systems"Ohmsha. (2001)
- Related Report
  2000 Annual Research Report
