2002 Fiscal Year Final Research Report Summary

Flexible Spoken Language Processing for Automatic Transcription of Lectures and Meetings

Research Project

Project/Area Number 12480085
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution KYOTO UNIVERSITY

Principal Investigator

KAWAHARA Tatsuya  Kyoto University, Graduate School of Informatics, Associate Professor (00234104)

Co-Investigator (Kenkyū-buntansha)

DOSHITA Shuji  Ryukoku University, Faculty of Science and Technology, Professor (00025925)
IKEDA Katsuo  Osaka Institute of Technology, Department of Information Science, Professor (30026009)
KUROHASHI Sadao  The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (50263108)
OKUNO Hiroshi  Kyoto University, Graduate School of Informatics, Professor (60318201)
SATO Satoshi  Kyoto University, Graduate School of Informatics, Associate Professor (30205918)
Project Period (FY) 2000 – 2002
Keywords spoken language processing / speech recognition / spontaneous speech / acoustic model / language model / HMM / N-gram
Research Abstract

Automatic transcription of lectures is addressed using the Corpus of Spontaneous Japanese, collected under the priority research project in Japan. First, we investigate the effect of speaking style and the amount of training data on acoustic modeling. Then, to complement the training data for the language model, we incorporate other text corpora and optimize the mixture weights. We also implement a sequential decoding method that does not require prior segmentation of lecture recordings.
Next, we investigate acoustic, pronunciation, and language modeling to improve accuracy, focusing on the following issues:
(1) Speaking-rate dependent decoding and adaptation of the acoustic model
(2) Statistical modeling of pronunciation variations and unsupervised adaptation of the language model
Furthermore, we study the following spoken language processing tasks:
(3) Automatic indexing of lecture audio by extracting topic-independent discourse markers
(4) Automatic transformation of lecture transcriptions into document style using a statistical framework
(5) Extraction of important sentences from lectures using statistics of discourse markers and topic words
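
As an illustration of the mixture-weight optimization mentioned above, the following is a minimal sketch of the standard EM procedure for estimating linear interpolation weights over several language models on held-out data. It is not taken from the report: the function name `em_mixture_weights`, the input layout, and the iteration count are assumptions made for this example, and the project's actual optimization procedure is not described in this summary.

```python
def em_mixture_weights(component_probs, iterations=50):
    """Estimate interpolation weights for a mixture of language models.

    component_probs[i][k] is the probability that model k assigns to the
    i-th held-out token given its history (hypothetical input layout).
    All probabilities are assumed to be strictly positive.
    """
    n_tokens = len(component_probs)
    n_models = len(component_probs[0])
    # Start from uniform weights.
    weights = [1.0 / n_models] * n_models
    for _ in range(iterations):
        totals = [0.0] * n_models
        for probs in component_probs:
            # Mixture probability of this token under the current weights.
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: posterior responsibility of each model for this token.
            for k in range(n_models):
                totals[k] += weights[k] * probs[k] / mix
        # M-step: new weight is the average responsibility.
        weights = [t / n_tokens for t in totals]
    return weights


# Toy held-out data: model 0 (e.g. a lecture corpus model) fits every token
# better than model 1 (e.g. a general text corpus model).
toy_probs = [[0.5, 0.1]] * 10
toy_weights = em_mixture_weights(toy_probs)
```

In this toy case the weight of the better-fitting model grows toward 1. With real held-out text, the weights settle at values that reflect how much each corpus contributes to predicting the target lecture domain.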

  • Research Products

    (19 results)


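  • [Publications] H. Nanjo: "Lecture speech recognition using a large-scale database of spoken Japanese"IEICE Transactions (Japanese Edition). J86-DII, 4. 450-459 (2003)

    • Description
      From the Research Report Summary (Japanese version)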
  • [Publications] M. Hasegawa: "Automatic indexing of lecture speech based on extraction of discourse markers"IPSJ Journal. 43, 7. 2222-2229 (2002)

    • Description
      From the Research Report Summary (Japanese version)
  • [Publications] H.Nanjo: "Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition"Proc. IEEE-ICASSP. 1. 725-728 (2002)

    • Description
      From the Research Report Summary (Japanese version)
  • [Publications] T.Kawahara: "Automatic transcription of spontaneous lecture speech"IEEE workshop Automatic Speech Recognition and Understanding. (2001)

    • Description
      From the Research Report Summary (Japanese version)
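  • [Publications] T. Kawahara: "Basic software for Japanese dictation (FY1999 version)"Journal of the Acoustical Society of Japan. 57, 3. 210-214 (2001)

    • Description
      From the Research Report Summary (Japanese version)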
  • [Publications] T. Kawahara: "An overview of spontaneous speech recognition"IEICE Technical Report. SP2000-95. (2000)

    • Description
      From the Research Report Summary (Japanese version)
  • [Publications] K. Shikano: "Speech Recognition Systems"Ohmsha. 200 (2001)

    • Description
      From the Research Report Summary (Japanese version)
  • [Publications] T. Kawahara, H. Nanjo, T. Shinozaki, S. Furui: "Benchmark test for speech recognition using the Corpus of Spontaneous Japanese"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)

    • Description
      From the Research Report Summary (English version)
  • [Publications] H. Nanjo, K. Shitaoka, T. Kawahara: "Automatic transformation of lecture transcription into document style using statistical framework"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)

    • Description
      From the Research Report Summary (English version)
  • [Publications] H. Nanjo, T. Kawahara: "Unsupervised language model adaptation for lecture speech recognition"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)

    • Description
      From the Research Report Summary (English version)
  • [Publications] Y. Akita, M. Nishida, T. Kawahara: "Automatic transcription of discussions using unsupervised speaker indexing"Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition. (2003)

    • Description
      From the Research Report Summary (English version)
  • [Publications] T. Kawahara, M. Hasegawa: "Automatic indexing of lecture speech by extracting topic-independent discourse markers"Proc. IEEE-ICASSP. 1-4 (2002)

    • Description
      From the Research Report Summary (English version)
  • [Publications] H. Nanjo, T. Kawahara: "Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition"Proc. IEEE-ICASSP. 725-728 (2002)

    • Description
      From the Research Report Summary (English version)
  • [Publications] T. Kawahara, H. Nanjo, S. Furui: "Automatic transcription of spontaneous lecture speech"Proc. IEEE workshop on Automatic Speech Recognition and Understanding. (2001)

    • Description
      From the Research Report Summary (English version)
  • [Publications] H. Nanjo, K. Kato, T. Kawahara: "Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition"Proc. EUROSPEECH. 2531-2534 (2001)

    • Description
      From the Research Report Summary (English version)
  • [Publications] A. Lee, T. Kawahara, K. Shikano: "Julius -- an open source real-time large vocabulary recognition engine"Proc. EUROSPEECH. 1691-1694 (2001)

    • Description
      From the Research Report Summary (English version)
  • [Publications] A. Lee, T. Kawahara, K. Shikano: "Gaussian mixture selection using context-independent HMM"Proc. IEEE-ICASSP. 69-72 (2001)

    • Description
      From the Research Report Summary (English version)
  • [Publications] K. Kato, H. Nanjo, T. Kawahara: "Automatic transcription of lecture speech using topic-independent language modeling"Proc. ICSLP. Vol. 1. 162-165 (2000)

    • Description
      From the Research Report Summary (English version)
  • [Publications] A. Lee, T. Kawahara, K. Takeda, K. Shikano: "A new phonetic tied-mixture model for efficient decoding"Proc. IEEE-ICASSP. 1269-1272 (2000)

    • Description
      From the Research Report Summary (English version)

Published: 2004-04-14  
