Project/Area Number |
10680382
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | NARA INSTITUTE OF SCIENCE AND TECHNOLOGY |
Principal Investigator |
SHIKANO Kiyohiro Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00263426)
|
Co-Investigator(Kenkyū-buntansha) |
KAWANAMI Hiromichi Nara Institute of Science and Technology, Graduate School of Information Science, Assistant (80335489)
LEE Akinobu Nara Institute of Science and Technology, Graduate School of Information Science, Assistant (80332766)
SARUWATARI Hiroshi Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30324974)
|
Project Period (FY) |
1998 – 2001
|
Keywords | Japanese dictation system / Unsupervised speaker adaptation / Noise adaptation / Spoken language model / Task-oriented language model / Spoken dialog system |
Research Abstract |
We have been developing several noise and speaker adaptation algorithms for Japanese dictation systems, and we have also been developing task-oriented language models. The research results are summarized as follows:
1. Unsupervised speaker adaptation algorithm and its evaluation. Unsupervised speaker adaptation algorithms based on HMM sufficient statistics and speaker selection were newly devised. These algorithms require only one arbitrary speech utterance for adaptation. Speaker-adapted HMM phone models are constructed from the HMM sufficient statistics of the selected speakers. The effectiveness of the unsupervised speaker adaptation algorithms was successfully evaluated using a large-scale speech database.
2. Environment noise adaptation algorithm and its evaluation. The unsupervised speaker adaptation algorithms above were successfully extended to deal with environment noise adaptation. Moreover, spectral subtraction was introduced to improve the environment noise adaptation algorithm. Simultaneous speaker and noise adaptation is carried out based on HMM sufficient statistics and speaker selection. Its effectiveness was also successfully evaluated using a large-scale speech database.
3. Task-oriented language model and construction of a spoken dialog system. Automatic language model construction tools were implemented using a Web search robot and a Japanese text extraction tool based on character trigrams. We constructed a university receptionist robot system and evaluated these algorithms. We are collecting a large amount of speech data in real environments using the receptionist robot system.
4. Technology transfer to the public. The developed speaker and noise adaptation algorithms have been transferred to the public through the "Continuous Speech Recognition Consortium" in the Information Processing Society.
|