Spontaneous speech recognition
Project/Area Number |
15500098
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Yamagata University |
Principal Investigator |
KOHDA Masaki Yamagata University, Faculty of Engineering, Professor (00205337)
|
Co-Investigator (Kenkyū-buntansha) |
KOSAKA Tetsuo Yamagata University, Faculty of Engineering, Associate Professor (50359569)
KATOH Masaharu Yamagata University, Faculty of Engineering, Research Associate (10250953)
|
Project Period (FY) |
2003 – 2005
|
Project Status |
Completed (Fiscal Year 2005)
|
Budget Amount |
¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2005: ¥800,000 (Direct Cost: ¥800,000)
Fiscal Year 2004: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2003: ¥1,400,000 (Direct Cost: ¥1,400,000)
|
Keywords | Corpus of Spontaneous Japanese / Spontaneous speech recognition / Robust speech recognition / Acoustic model / Language model / Unsupervised adaptation / Continuous-mixture HMM / Discrete-mixture HMM / Speech recognition / Pronunciation-variation-dependent model / MLLR / Part-of-speech N-gram |
Research Abstract |
We investigated spontaneous speech recognition on an academic lecture task and obtained the following results.
(1) Lecture speech recognition using pronunciation variant modeling and unsupervised adaptation. We focused on the pronunciation variations observed in spontaneous speech. To make pronunciation variants context-dependent, we proposed a new language modeling method based on morphological analysis data designed for pronunciation variants. Evaluated on the Corpus of Spontaneous Japanese (CSJ), the proposed method reduced the word error rate (WER) by 4.74% absolute. In addition, unsupervised adaptation of both the acoustic and language models was introduced to improve recognition performance further. WER decreased from 19.96% without adaptation to 15.41% with unsupervised adaptation.
(2) Lecture speech recognition using discrete-mixture HMMs. We have investigated noisy speech recognition using discrete-mixture HMMs (DMHMMs) and found that DMHMMs outperformed continuous-mixture HMMs under environmental noise and impulsive noise conditions. However, it was not clear whether this method is also effective in clean conditions, so the aim of this investigation was to evaluate the DMHMM system in clean conditions. For the evaluation we used the CSJ, a common speech corpus, so that our system could be compared with other recognition systems and its performance on a more difficult task could be clarified. In the recognition experiments, 3000-state DMHMMs (16 mixture components per state) were used as acoustic models. The language model, which represents pronunciation variation, was trained on 6.86 million words from 2,668 lectures in the CSJ and was used for recognition. As a result, the system obtained a WER of 20.30% on 10 academic lectures uttered by male speakers, demonstrating the effectiveness of the proposed method.
|
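The WER figures quoted in the abstract can be read more easily with a small worked example. The sketch below is illustrative only: the record does not say how the project scored its output, so it assumes the standard definition of WER (word-level edit distance divided by reference length) and whitespace-separated tokens, which for Japanese would first require morphological segmentation. The function names `word_error_rate` and `wer_reduction` are hypothetical. The final lines show the difference between an absolute and a relative reduction using the adaptation result reported above (19.96% to 15.41%).

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length.

    Assumes both inputs are already tokenized into whitespace-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(before, after):
    """Return (absolute, relative) WER reduction, both in percent."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


# Figures reported in the abstract for unsupervised adaptation.
absolute, relative = wer_reduction(19.96, 15.41)
print(f"absolute: {absolute:.2f} points, relative: {relative:.1f}%")
# prints "absolute: 4.55 points, relative: 22.8%"
```

An "absolute" reduction is a difference in percentage points, so the 4.74% absolute figure in result (1) and the 4.55-point adaptation gain computed here are directly comparable, whereas a relative reduction depends on the baseline WER.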
Report
(4 results)
Research Products
(44 results)