Grant-in-Aid for Scientific Research (C).
|Research Institution||NARA INSTITUTE OF SCIENCE AND TECHNOLOGY|
SHIKANO Kiyohiro Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00263426)
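LU Jinlin Nara Institute of Science and Technology, Graduate School of Information Science, Assistant (50230868)
NAKAMURA Satoshi Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30263429)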
KAWANAMI Hiromichi Nara Institute of Science and Technology, Graduate School of Information Science, Assistant (80335489)
LEE Akinobu Nara Institute of Science and Technology, Graduate School of Information Science, Assistant (80332766)
SARUWATARI Hiroshi Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30324974)
|Project Fiscal Year||1998 – 2001 (Completed, Fiscal Year 2001)|
|Budget Amount||¥3,500,000 (Direct Cost: ¥3,500,000)|
|Fiscal Year 2001||¥1,000,000 (Direct Cost: ¥1,000,000)|
|Fiscal Year 2000||¥500,000 (Direct Cost: ¥500,000)|
|Fiscal Year 1999||¥700,000 (Direct Cost: ¥700,000)|
|Fiscal Year 1998||¥1,300,000 (Direct Cost: ¥1,300,000)|
|Keywords||Japanese dictation system / Unsupervised speaker adaptation / Noise adaptation / Spoken language model / Task-oriented language model / Spoken dialog system / Environment adaptation / Continuous speech recognition / Speaker adaptation / Task adaptation / Dictation|
We have been developing several noise and speaker adaptation algorithms for Japanese dictation systems. We have also been developing task-oriented language models. The research results are summarized as follows:
1. Unsupervised speaker adaptation algorithm and its evaluation
Unsupervised speaker adaptation algorithms based on HMM sufficient statistics and speaker selection were newly developed. These algorithms require only a single arbitrary utterance for adaptation. Speaker-adapted HMM phone models are constructed from the HMM sufficient statistics of the selected speakers. The effectiveness of the algorithms was confirmed using a large-scale speech database.
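The construction of an adapted model from the sufficient statistics of selected speakers can be illustrated with a minimal sketch. It assumes a single one-dimensional Gaussian per HMM state and selects speakers by the likelihood of the adaptation utterance under a simple per-speaker Gaussian; the selection criterion, the `SpeakerStats` record, and all function names are hypothetical and are not taken from the project itself.

```python
import math
from dataclasses import dataclass


@dataclass
class SpeakerStats:
    """Hypothetical per-speaker record for one Gaussian of one HMM state (1-D)."""
    name: str
    sel_mean: float  # mean of a simple speaker model, used only for selection
    sel_var: float   # variance of that speaker model
    count: float     # zeroth-order statistic: total state occupancy
    first: float     # first-order statistic: sum of gamma * x
    second: float    # second-order statistic: sum of gamma * x^2


def log_likelihood(frames, mean, var):
    """Log-likelihood of 1-D feature frames under a single Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )


def select_speakers(frames, speakers, n):
    """Pick the n speakers whose models best explain the adaptation utterance."""
    ranked = sorted(
        speakers,
        key=lambda s: log_likelihood(frames, s.sel_mean, s.sel_var),
        reverse=True,
    )
    return ranked[:n]


def pool_statistics(selected):
    """Re-estimate a Gaussian from the summed statistics of the selected speakers."""
    total = sum(s.count for s in selected)
    mean = sum(s.first for s in selected) / total
    var = sum(s.second for s in selected) / total - mean ** 2
    return mean, var


speakers = [
    SpeakerStats("A", 0.0, 1.0, count=10.0, first=0.0, second=10.0),
    SpeakerStats("B", 1.0, 1.0, count=10.0, first=10.0, second=20.0),
    SpeakerStats("C", 5.0, 1.0, count=10.0, first=50.0, second=260.0),
]
utterance = [0.4, 0.6, 0.5]  # one short adaptation utterance (toy features)
chosen = select_speakers(utterance, speakers, n=2)
adapted_mean, adapted_var = pool_statistics(chosen)
```

Because the statistics are stored per speaker in advance, adaptation in this sketch only sums a few numbers and never revisits the original training speech.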
2. Environment noise adaptation algorithm and its evaluation
The unsupervised speaker adaptation algorithms above were extended to handle environmental noise adaptation. In addition, spectral subtraction was introduced to improve the noise adaptation algorithm. Simultaneous speaker and noise adaptation is carried out based on HMM sufficient statistics and speaker selection. Its effectiveness was also confirmed using a large-scale speech database.
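Spectral subtraction itself can be sketched in a few lines. The version below is a generic textbook form operating on power spectra, with an over-subtraction factor `alpha` and a spectral floor `beta`; both parameters, the function names, and the noise estimate from noise-only frames are assumptions for illustration, not details reported for this project.

```python
def estimate_noise(noise_frames):
    """Average the power spectra of frames assumed to contain noise only."""
    n_frames = len(noise_frames)
    n_bins = len(noise_frames[0])
    return [
        sum(frame[k] for frame in noise_frames) / n_frames
        for k in range(n_bins)
    ]


def spectral_subtraction(power_frame, noise_power, alpha=1.0, beta=0.01):
    """Subtract the noise estimate from one power spectrum.

    alpha scales the subtracted noise; beta * y is a floor that keeps
    every bin positive when the estimate exceeds the observed power.
    """
    return [
        max(y - alpha * n, beta * y)
        for y, n in zip(power_frame, noise_power)
    ]


noise = estimate_noise([[1.0, 2.0], [3.0, 2.0]])
cleaned = spectral_subtraction([10.0, 1.0], noise, alpha=1.0, beta=0.1)
```

In the second bin the noise estimate is larger than the observed power, so the floor takes effect instead of producing a negative value.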
3. Task-oriented language model and construction of spoken dialog system
Automatic language model construction tools were implemented using a Web search robot and a Japanese text extraction tool based on character trigrams. We built a university receptionist robot system and used it to evaluate these algorithms. We are also using the receptionist robot system to collect a large speech database in real environments.
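One way a character-trigram filter could separate Japanese sentences from other Web page content is sketched below. The scoring rule (the fraction of a line's character trigrams already seen in seed text), the threshold, and all function names are assumptions chosen for illustration; the record does not describe how the actual extraction tool scores text.

```python
from collections import Counter


def char_trigrams(text):
    """Return all overlapping three-character sequences in a string."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def train(seed_sentences):
    """Count character trigrams in known-good Japanese seed sentences."""
    counts = Counter()
    for sentence in seed_sentences:
        counts.update(char_trigrams(sentence))
    return counts


def coverage(line, counts):
    """Fraction of the line's trigrams that also occur in the seed text."""
    grams = char_trigrams(line)
    if not grams:
        return 0.0
    return sum(1 for g in grams if g in counts) / len(grams)


def extract(lines, counts, threshold=0.5):
    """Keep only lines whose trigram coverage reaches the threshold."""
    return [line for line in lines if coverage(line, counts) >= threshold]


model = train(["今日は天気がいいです", "明日は雨が降ります"])
page_lines = ["今日は雨が降ります", "<a href=x>click</a>"]
kept = extract(page_lines, model)
```

Lines kept by such a filter could then serve as training text for a task-oriented language model, while markup and navigation residue are discarded.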
4. Technology transfer to the public
The developed speaker and noise adaptation algorithms have been transferred to the public through the "Continuous Speech Recognition Consortium" of the Information Processing Society.