2006 Fiscal Year Final Research Report Summary
User Friendly Speech Recognition Algorithm with Adaptability for Environments and Users
Project/Area Number |
15300060
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
SHIKANO Kiyohiro Nara Institute of Science and Technology, Graduate School of Information Science, Professor (00263426)
|
Co-Investigator(Kenkyū-buntansha) |
SARUWATARI Hiroshi Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (30324974)
TODA Tomoki Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (90403328)
KAWANAMI Hiromichi Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80335489)
|
Project Period (FY) |
2003 – 2006
|
Keywords | speech recognition / speech dialog system / hands-free speech recognition / speaker adaptation / speech database / phoneme model / language model / non-audible murmur |
Research Abstract |
The research plan covered the following topics:
1. Noise reduction signal processing, accurate phoneme modeling, and speaker and environment adaptation.
2. Task-adapted language models for accepting spontaneous utterances.
3. A hands-free speech recognition interface.
4. A study of human factors for speech dialog systems.
These topics were investigated using real-world speech dialog systems. The main research results are summarized as follows:
1. An unsupervised speaker adaptation algorithm based on an arbitrary utterance has been developed. It takes only several seconds and achieves almost the same accuracy as supervised MLLR adaptation based on 50 utterances.
2. Task-adapted language models have been developed for children and adults using two years of Takemaru-kun transcribed texts. Introducing parallel decoding for children and adults improved both word accuracy and response accuracy.
3. Hands-free speech recognition has been implemented based on a null-beamformer-type SSA (Spatial Subtraction Array) and BSSA (Blind SSA) with SIMO-ICA.
4. We have successfully operated the Takemaru-kun speech guidance system at the Ikoma North community center for the past four years. We have also operated two speech guidance systems at a local railway station for the past year to study noisy conditions. The collected speech database, together with its transcriptions, is useful for developing speech dialog systems.
We also invented a new quiet speech medium, Non-Audible Murmur (NAM), which has been applied to quiet speech recognition and a quiet telephone. This invention received the IEICE paper prize and the Inose award.
|
Research Products
(301 results)