2004 Fiscal Year Final Research Report Summary

Robust HMM speech recognition using robust time-varying complex speech analysis

Research Project

Project/Area Number	14550363
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Information and communication engineering
Research Institution	University of the Ryukyus
Principal Investigator	FUNAKI Keiichi University of the Ryukyus, Computing and Networking center, Lecturer (30315486)
Project Period (FY)	2002 – 2004
Keywords	speech analysis / complex signal processing / robust analysis / HMM speech recognition / time-varying analysis / ELS method / feature extraction / HTK
Research Abstract	We previously proposed several robust time-varying complex AR (TV-CAR) speech analysis methods, and in this project we aim to realize robust speech recognition by adopting the TV-CAR method as the front end of a speech recognizer. The TV-CAR methods use a time-varying complex AR model as the speech production model, in which each AR parameter is represented by a complex basis expansion, and they estimate the time-varying complex AR parameters from the analytic speech signal. Before 2002 we had proposed the MMSE, M-estimation, IV, GLS (Generalized Least Squares), and ELS (Extended Least Squares) methods. The GLS and ELS methods yield unbiased speech spectrum estimates that are less affected by noise, and thus provide robust speech spectrum estimation. Since 2002 we have proposed more precise speech analysis methods: GLS and ELS algorithms based on forward and backward linear prediction (FB-LP), and an ELS algorithm based on the output error. We adopt HTK (the Hidden Markov Model Toolkit) as the HMM speech recognizer. To apply the TV-CAR method to HTK, we investigated the conversion of TV-CAR parameters into HTK-formatted LPC cepstrum coefficients (LPCC); as a result, we have realized HTK speech recognition using the TV-CAR method. We are now evaluating the effectiveness of the time-varying features and of complex analysis in HTK speech recognition. We will further evaluate the effectiveness of the robust speech analysis algorithms, namely the ELS and the FB-LP-based ELS.
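The parameter conversion mentioned in the abstract can be illustrated with a minimal sketch. The report does not describe the conversion actually used in the project, so everything below is an assumption for illustration: the polynomial basis t**l, the function names tv_ar_coeffs and ar_to_cepstrum, and the sign convention A(z) = 1 + sum_k a_k z^-k. The sketch evaluates complex AR coefficients from a basis expansion at one time instant and then applies the standard recursion that turns all-pole coefficients into cepstrum coefficients.

```python
import numpy as np


def tv_ar_coeffs(g, t):
    """Evaluate time-varying AR coefficients at time t.

    Assumes a_i(t) = sum_l g[i, l] * t**l. The polynomial basis is an
    illustrative assumption; the report only says "complex basis expansion".
    """
    g = np.asarray(g, dtype=complex)
    powers = t ** np.arange(g.shape[1])  # [1, t, t^2, ...]
    return g @ powers


def ar_to_cepstrum(a, n_ceps):
    """Cepstrum of 1/A(z), with A(z) = 1 + sum_{k=1..p} a[k-1] * z^-k.

    Standard recursion: c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    where a_n = 0 for n > p. The gain term c_0 is not computed.
    """
    a = np.asarray(a, dtype=complex)
    p = len(a)
    c = np.zeros(n_ceps, dtype=complex)
    for n in range(1, n_ceps + 1):
        # Only terms with 1 <= n - k <= p contribute to the sum.
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        a_n = a[n - 1] if n <= p else 0.0
        c[n - 1] = -a_n - acc / n
    return c


if __name__ == "__main__":
    # Hypothetical expansion coefficients for a 2nd-order model, 2 basis terms.
    g = [[-0.9 + 0.2j, 0.1j], [0.4 - 0.1j, -0.05]]
    a_t = tv_ar_coeffs(g, t=0.5)
    print(ar_to_cepstrum(a_t, n_ceps=12))
```

For a complex AR model the resulting cepstrum is complex-valued in general. How those values are mapped onto the real-valued LPCC vectors that HTK expects is not stated in the report and is left open here.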