1993 Fiscal Year Final Research Report Summary
A Study for Utilizing the Linguistic Information in Phoneme Recognition to Understand Continuous Speech
Project/Area Number |
03452173
|
Research Category |
Grant-in-Aid for General Scientific Research (B)
|
Allocation Type | Single-year Grants |
Research Field |
Information Engineering
|
Research Institution | Chiba Institute of Technology |
Principal Investigator |
KIDO Ken'iti Chiba Inst. of Tech., Engineering, Prof., Faculty of Engineering, Professor (30006209)
|
Co-Investigator(Kenkyū-buntansha) |
MAKINO Shozo Tokyo Univ., Research Center for Applied Information Sciences, Associate Prof., Research Center for Applied Information Sciences, Associate Professor (00089806)
ARAI Shuichi Chiba Inst. of Tech., Engineering, Associate Prof., Faculty of Engineering, Lecturer (20212590)
UKIGAI Masahiro Chiba Inst. of Tech., Engineering, Associate Prof., Faculty of Engineering, Associate Professor (80118695)
SUGAWARA Kenji Chiba Inst. of Tech., Engineering, Prof., Faculty of Engineering, Professor (00137853)
MIIDA Yoshiro Chiba Inst. of Tech., Engineering, Prof., Faculty of Engineering, Professor (10083859)
|
Project Period (FY) |
1991 – 1993
|
Keywords | Continuous Speech Recognition / Speech Recognition / Phoneme Recognition / Speaker Independent / Linguistic Information |
Research Abstract |
In this study, we proposed two higher-performance phoneme recognition methods and a continuous speech recognition method that uses the linguistic information around the target phoneme. First, we proposed MR-HMM (Multi-Resolution HMM), which is based on the wavelet transform and can control the time-frequency resolution. We proposed WTD (Wavelet transform Tree Data) to represent the time-frequency space of the scalogram obtained through the wavelet transform. Using this WTD structure, we proposed the State Merge Algorithm for training MR-HMM, which achieves a high recognition rate. Next, we proposed a phoneme recognition method that uses nine acoustic features in addition to the cepstrum parameters, which are the most popular features but are not sufficient on their own. In general, using several kinds of acoustic parameters requires analyzing which parameters suit the recognition of each specific phoneme. The proposed method, however, makes it possible to use several kinds of parameters without that analysis. We proposed the Membership Scale so that the linear discriminant method, which is designed for two-category discrimination, can be applied to multi-category discrimination. With this method, the linguistic recognition stage can obtain the reliability of the results from the acoustic recognition stage. Finally, we proposed a new linguistic recognition method that uses the co-occurrence relationships of the words within one sentence. Because this method does not use grammatical knowledge, it can handle task-free speech. By combining this linguistic recognition method with the acoustic recognition methods described above, misrecognition in the acoustic recognition stage can be kept under control by the linguistic recognition stage. The experimental results confirmed the effectiveness of the proposed recognition methods.
|
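The abstract does not give the details of the co-occurrence-based linguistic stage, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that the acoustic stage outputs, for each word position, a short list of candidate words with scores, and that co-occurrence is measured as how often two words appear in the same sentence of a small text corpus. The function names (`cooccurrence_counts`, `pair_score`, `rescore`), the log-count scoring, and the `weight` parameter are all hypothetical choices made for this example.

```python
from itertools import product
from math import log


def cooccurrence_counts(sentences):
    """Count how often each unordered word pair occurs in the same sentence."""
    counts = {}
    for words in sentences:
        unique = sorted(set(words))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


def pair_score(counts, a, b):
    """Hypothetical co-occurrence score: log(1 + count), 0 for unseen pairs."""
    key = (a, b) if a <= b else (b, a)
    return log(1 + counts.get(key, 0))


def rescore(candidates, counts, weight=1.0):
    """Pick one word per position, maximizing acoustic + co-occurrence score.

    candidates: list of positions, each a list of (word, acoustic_score).
    No grammar is used; only pairwise co-occurrence within the sentence.
    Exhaustive search, so this is only practical for short candidate lists.
    """
    best_words, best_total = None, float("-inf")
    for combo in product(*candidates):
        words = [w for w, _ in combo]
        acoustic = sum(s for _, s in combo)
        linguistic = sum(
            pair_score(counts, words[i], words[j])
            for i in range(len(words))
            for j in range(i + 1, len(words))
        )
        total = acoustic + weight * linguistic
        if total > best_total:
            best_words, best_total = words, total
    return best_words


# Toy data (invented for illustration only).
corpus = [
    ["train", "leaves", "station"],
    ["train", "arrives", "station"],
    ["rain", "falls", "today"],
]
candidates = [
    [("rain", -1.0), ("train", -1.2)],  # acoustic stage slightly prefers "rain"
    [("leaves", -0.5)],
    [("station", -0.4)],
]
counts = cooccurrence_counts(corpus)
print(rescore(candidates, counts))              # co-occurrence favors "train"
print(rescore(candidates, counts, weight=0.0))  # acoustic scores only
```

In this toy case the acoustic scores alone would select "rain", while the co-occurrence term shifts the choice to "train" because it appears with "station" and "leaves" in the corpus. This mirrors, in a very simplified form, how a linguistic stage can counteract an acoustic misrecognition without grammatical knowledge.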
Research Products
(6 results)