Large-vocabulary continuous speech recognition on spontaneous speech task
Project/Area Number |
18500126
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Yamagata University |
Principal Investigator |
KOHDA Masaki Yamagata University, Graduate School of Science and Engineering, Professor (00205337)
|
Co-Investigator (Kenkyū-buntansha) |
KOSAKA Tetsuo Yamagata University, Graduate School of Science and Engineering, Associate Professor (50359569)
KATOH Masaharu Yamagata University, Graduate School of Science and Engineering, Research Associate (10250953)
|
Project Period (FY) |
2006 – 2007
|
Project Status |
Completed (Fiscal Year 2007)
|
Budget Amount |
¥1,910,000 (Direct Cost: ¥1,700,000, Indirect Cost: ¥210,000)
Fiscal Year 2007: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2006: ¥1,000,000 (Direct Cost: ¥1,000,000)
|
Keywords | Corpus of Spontaneous Japanese / Speech recognition / Acoustic model / Language model / Unsupervised adaptation / System integration / Robust speech recognition / Continuous-mixture HMM / Discrete-mixture HMM |
Research Abstract |
1. Large-vocabulary continuous speech recognition on a spontaneous speech task. We investigated several methods of unsupervised adaptation of both acoustic and language models for large-vocabulary continuous speech recognition (LVCSR) and evaluated them on the Corpus of Spontaneous Japanese (CSJ). The LVCSR system uses full-covariance matrices in its acoustic model. Recognition experiments showed a decrease in word error rate (WER) from 19.17% without adaptation to 14.73% with unsupervised adaptation, and further to 14.47% when the adaptation data were weighted by part of speech. We also compared a continuous-mixture HMM (CHMM) system and a discrete-mixture HMM (DMHMM) system on the CSJ. The DMHMM system performed almost as well as the CHMM system, achieving a WER of 19.73% with 6000-state, 24-mixture DMHMMs, even though DMHMM error rates have generally been believed to be much higher than those of CHMMs.
2. Robust speech recognition using discrete-mixture HMMs. We introduced a new method for robust speech recognition under noisy conditions based on discrete-mixture HMMs (DMHMMs). DMHMMs were originally proposed to reduce computation in the decoding process; more recently, we applied them to noisy speech recognition and found them effective for modeling noisy speech. To further improve noise robustness, we proposed a novel normalization method for DMHMMs based on histogram equalization (HEQ). HEQ compensates for the nonlinear effects of additive noise and is commonly used for feature-space normalization in CHMM systems. We proposed both model-space and feature-space normalization of DMHMMs using HEQ: in model-space normalization, the codebooks of the DMHMMs are modified by the transform function derived from HEQ. The proposed method was evaluated against both conventional CHMMs and DMHMMs, and the results showed that model-space normalization of DMHMMs with multiple transform functions is effective for noise-robust speech recognition.
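The WER figures quoted above are the standard metric: word-level edit distance (substitutions + deletions + insertions) normalized by the reference length. A minimal sketch of the computation (this is the generic Levenshtein-based definition, not the project's scoring tool):

```python
def word_error_rate(ref, hyp):
    """WER = (substitutions + deletions + insertions) / len(ref),
    computed as Levenshtein distance over word sequences."""
    r, h = ref.split(), hyp.split()
    # DP table: d[i][j] = edit distance between r[:i] and h[:j].
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(r)][len(h)] / len(r)
```

For example, with reference "a b c d" and hypothesis "a x c" (one substitution, one deletion), the WER is 2/4 = 0.5.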
|
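The HEQ normalization described in the abstract matches the cumulative distribution of test features to a reference distribution, which can compensate for nonlinear noise effects. A minimal sketch of feature-space HEQ via quantile mapping (the function name, rank-based CDF estimate, and the use of reference samples are illustrative assumptions, not the project's implementation):

```python
import numpy as np

def histogram_equalize(features, ref_samples):
    """Feature-space HEQ: transform each dimension of `features` so its
    empirical CDF matches that of `ref_samples` (quantile mapping)."""
    T, D = features.shape
    out = np.empty_like(features, dtype=float)
    for d in range(D):
        # Rank-based estimate of the test CDF at each frame.
        ranks = np.argsort(np.argsort(features[:, d]))
        cdf = (ranks + 0.5) / T
        # Map each CDF value to the corresponding reference quantile.
        out[:, d] = np.quantile(ref_samples[:, d], cdf)
    return out
```

Because the mapping is monotone per dimension, the ordering of feature values is preserved while their distribution is reshaped; model-space HEQ, as in the abstract, would instead apply the derived transform to the DMHMM codebooks.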