Project/Area Number |
15300060
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
SHIKANO Kiyohiro Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00263426)
|
Co-Investigator(Kenkyū-buntansha) |
SARUWATARI Hiroshi Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30324974)
TODA Tomoki Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (90403328)
KAWANAMI Hiromichi Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80335489)
LEE Akinobu (李 晃伸) Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80332766)
|
Project Period (FY) |
2003 – 2006
|
Project Status |
Completed (Fiscal Year 2006)
|
Budget Amount |
¥16,100,000 (Direct Cost: ¥16,100,000)
Fiscal Year 2006: ¥4,500,000 (Direct Cost: ¥4,500,000)
Fiscal Year 2005: ¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2004: ¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2003: ¥5,200,000 (Direct Cost: ¥5,200,000)
|
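Keywords | speech recognition / speech dialog system / hands-free speech recognition / speaker adaptation / speech database / phoneme model / language model / non-audible murmur / speech information guidance system / spoken question-answering database / blind source separation / speaker and environment adaptation / speech-based age-group identification / noise model / silent speech recognition and silent speech telephone |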
Research Abstract |
The research plan covered the following topics: 1. Noise-reduction signal processing, accurate phoneme modeling, and speaker and environment adaptation. 2. Task-adapted language models that accept spontaneous utterances. 3. A hands-free speech recognition interface. 4. A study of human factors in speech dialog systems. These topics were investigated using real-world speech dialog systems. The main results are summarized as follows. 1. An unsupervised speaker adaptation algorithm based on an arbitrary utterance was developed; it takes only several seconds and achieves almost the same accuracy as supervised MLLR adaptation based on 50 utterances. 2. Task-adapted language models for children and adults were developed using two years of transcribed Takemaru-kun utterances. Introducing parallel decoding for children and adults improved both word accuracy and response accuracy. 3. Hands-free speech recognition was implemented based on a null-beamformer-type SSA (Spatial Subtraction Array) and on BSSA (Blind SSA) with SIMO-ICA. 4. The Takemaru-kun speech guidance system has been operating successfully at the Ikoma North community center for four years. Two further speech guidance systems have been operating at a local railway station for the past year to study noisy conditions. The collected speech database, together with its transcriptions, is useful for developing speech dialog systems. We also invented a new quiet speech medium, Non-Audible Murmur (NAM), and applied it to quiet speech recognition and a quiet telephone. This invention received the IEICE paper prize and the Inose award.
|