Robust HMMs against environmental variation for speech recognition
Project/Area Number |
10680376
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Shinshu University |
Principal Investigator |
MATSUMOTO Hiroshi Shinshu University, Faculty of Engineering, Professor (60005452)
|
Co-Investigator(Kenkyū-buntansha) |
内山 将夫 Shinshu University, Faculty of Engineering, Research Associate (70293496)
|
Project Period (FY) |
1998 – 1999
|
Project Status |
Completed (Fiscal Year 1999)
|
Budget Amount |
¥3,000,000 (Direct Cost: ¥3,000,000)
Fiscal Year 1999: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 1998: ¥2,400,000 (Direct Cost: ¥2,400,000)
|
Keywords | Variance expansion / Noise HMM / Noisy speech recognition / HMM composition / Noise robustness / Mel linear prediction analysis |
Research Abstract |
(1) A Study on Variance Expansion of HMMs Robust to Environmental Variation. This project addresses the problem of making HMMs robust to variation in SNR. The study developed a noise variance expansion technique for HMMs, which consists simply of expanding the variance of the cepstral coefficients of the noise model in HMM composition. The effect of this technique was examined through speaker-independent digit recognition tests using NOISEX-92 noise data. The results show that expanding the variance of the 0th-order cepstrum greatly improves robustness to a wide range of SNR mismatch compared with the standard HMM. The appropriate expansion factor can be determined irrespective of noise type such that the expanded variance of the 0th-order cepstrum is around 5 to 6 dB with respect to its geometric mean.
(2) A Study on a Spectral Analysis Robust to Additive Noise. Part of this project also developed a simple and efficient time-domain technique for estimating an all-pole model on a mel-frequency axis (Mel-LPC). This method requires only twice the computational cost of conventional linear prediction analysis. Gender-dependent phoneme recognition tests show that the Mel-LPC cepstrum attains a significant improvement in recognition accuracy over the conventional LP mel-cepstrum and mel-frequency cepstral coefficients (MFCC). Furthermore, noisy word recognition tests revealed that the Mel-LPC cepstrum is more robust to wide-band additive noise than the conventional LP mel-cepstrum and MFCC.
|
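As a rough illustration of the variance expansion idea in item (1) of the abstract, the sketch below scales the variance of a single cepstral coefficient in a diagonal-covariance noise model before that model would be used in HMM composition. It is a minimal sketch, not the project's implementation: the function names, the variance values, and the expansion factor of 4.0 are all hypothetical, and the decibel helper only shows the standard power-ratio conversion, not the exact way the abstract defines its 5 to 6 dB figure.

```python
import math


def expand_noise_variance(variances, factor, index=0):
    """Return a copy of the diagonal variances with one coefficient scaled.

    variances: per-coefficient variances of the noise model (hypothetical layout,
               with the 0th-order cepstrum at position 0).
    factor:    multiplicative expansion factor (assumed value, must be positive).
    index:     which cepstral coefficient to expand (0 = 0th-order cepstrum).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    expanded = list(variances)  # copy so the original model is left untouched
    expanded[index] *= factor
    return expanded


def ratio_to_db(ratio):
    """Standard conversion of a power (variance) ratio to decibels."""
    return 10.0 * math.log10(ratio)


# Hypothetical noise-model variances for cepstral coefficients c0, c1, c2.
noise_var = [0.40, 0.10, 0.05]
expanded_var = expand_noise_variance(noise_var, factor=4.0)

# Ratio of expanded to original c0 variance, expressed in dB.
expansion_db = ratio_to_db(expanded_var[0] / noise_var[0])
```

Only the 0th-order coefficient is changed here, mirroring the abstract's finding that expanding that coefficient's variance is what improves robustness to SNR mismatch; how the expanded noise model is then combined with the clean-speech HMM is outside this sketch.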
Report
(3 results)
Research Products
(6 results)