Project/Area Number |
63550253
|
Research Category |
Grant-in-Aid for General Scientific Research (C)
|
Allocation Type | Single-year Grants |
Research Field |
Electronic communication systems engineering
|
Research Institution | Nagoya Institute of Technology |
Principal Investigator |
KITAMURA Tadashi  Nagoya Institute of Technology, Faculty of Engineering, Associate Professor (60114865)
|
Co-Investigator (Kenkyū-buntansha) |
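早原 悦朗  Nagoya Institute of Technology, Faculty of Engineering, Professor (80024214)
山田 由之  Nagoya Institute of Technology, Faculty of Engineering, Research Associate (50024253)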
|
Project Period (FY) |
1988 – 1989
|
Project Status |
Completed (Fiscal Year 1989)
|
Budget Amount |
¥2,100,000 (Direct Cost: ¥2,100,000)
Fiscal Year 1989: ¥200,000 (Direct Cost: ¥200,000)
Fiscal Year 1988: ¥1,900,000 (Direct Cost: ¥1,900,000)
|
Keywords | noise / word recognition / two-dimensional mel-cepstrum / Japanese digit / dynamic features of spectra / spoken word recognition in noise / spoken digits / speech recognition in noise / human auditory characteristics / mel frequency / temporal variation of spectra |
Research Abstract |
The purpose of this research is to provide a new method for word recognition in noisy environments. Two kinds of noise are used: white noise generated by computer simulation and colored noise recorded at Nagoya Station. A speaker-independent recognition method for the ten Japanese digits using a two-dimensional mel-cepstrum (TDMC) is proposed. The TDMC is defined as the two-dimensional Fourier transform of mel-frequency-scaled logarithmic spectra along the frequency and time axes, and it consists of average features and dynamic features of the two-dimensional mel-log spectra. The experimental results are as follows.
1. Speech analysis-synthesis system using the TDMC and its evaluation: The structure of a speech analysis-synthesis system using the TDMC is proposed in order to determine how large a region of the TDMC is needed to synthesize good-quality speech. The required region of the TDMC is shown to lie below about 10 Hz in frequency.
2. Reference patterns robust to variation in the signal-to-noise ratio (SNR) of the input speech: A single set of TDMC reference patterns, built from speech with noise added at a chosen SNR, is used for word recognition in noisy environments. Experimental results show that recognition with this reference pattern set is more effective than a conventional method.
3. Distance measures for word recognition robust to variation in the SNR of the input speech: Distance measures that combine the dynamic and average features of the TDMC are proposed. Dynamic features are shown to be more important than average features for word recognition in noisy environments.
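The definition in the abstract can be illustrated with a minimal NumPy sketch. It is not the project's implementation. The frame length, hop size, number of mel bands, triangular filterbank, mel formula, and the weight `w_dynamic` are all assumptions chosen for illustration, as are the function names. Treating the zero time-frequency row as the "average features" and the remaining rows as the "dynamic features" is one plausible reading of the abstract, not a confirmed detail.

```python
import numpy as np

def hz_to_mel(f):
    # A common mel-scale formula; the record does not say which one was used.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def add_noise_at_snr(speech, noise, snr_db):
    """Scale noise so the mixture has the requested SNR in dB (hypothetical helper)."""
    noise = noise[:len(speech)]
    gain = np.sqrt(np.mean(speech ** 2) / (np.mean(noise ** 2) * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

def mel_log_spectrogram(x, fs, n_fft=256, hop=128, n_mels=20):
    """Return a (frames x mel bands) matrix of log filterbank energies."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular filters whose edges are equally spaced on the mel scale.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_mels + 2))
    fbank = np.zeros((n_mels, freqs.size))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fbank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return np.log(power @ fbank.T + 1e-10)  # small floor avoids log(0)

def tdmc(mel_log_spec):
    """2-D Fourier transform over time (axis 0) and mel frequency (axis 1)."""
    return np.fft.fft2(mel_log_spec)

def combined_distance(c_a, c_b, w_dynamic=0.5):
    """Weighted squared distance over average (row 0) and dynamic (rows 1+) parts.

    The weighting scheme is an assumption; both inputs must have the same shape.
    """
    d_avg = np.sum(np.abs(c_a[0] - c_b[0]) ** 2)
    d_dyn = np.sum(np.abs(c_a[1:] - c_b[1:]) ** 2)
    return (1.0 - w_dynamic) * d_avg + w_dynamic * d_dyn

# Synthetic stand-in for a spoken digit: a windowed tone plus white noise at 10 dB SNR.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
speech = np.sin(2.0 * np.pi * 440.0 * t) * np.hanning(fs)
noise = rng.standard_normal(fs)
noisy = add_noise_at_snr(speech, noise, snr_db=10.0)

S = mel_log_spectrogram(noisy, fs)
C = tdmc(S)
average_features = C[0, :]   # zero time-frequency row: time-averaged spectral shape
dynamic_features = C[1:, :]  # remaining rows: temporal variation of the spectra
```

In this sketch, row `k` of `C` corresponds to a time-direction frequency of `k * (fs / hop) / S.shape[0]` Hz. If the "about 10 Hz" figure in the abstract refers to that axis, which is an interpretation, only the first few rows (and their conjugate-symmetric counterparts) would need to be kept.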
|