1990 Fiscal Year Final Research Report Summary
Phonemic Segmentation and Word Recognition for Dialogue Level Continuous Speech
Project/Area Number |
01420028
|
Research Category |
Grant-in-Aid for General Scientific Research (A)
|
Allocation Type | Single-year Grants |
Research Field |
Electronic Communication Systems Engineering
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
IMAI Satoshi, Tokyo Institute of Technology, Research Laboratory of Precision Machinery and Electronics, Professor (50016763)
|
Co-Investigator (Kenkyū-buntansha) |
FURUICHI Chieko, Tokyo Institute of Technology, Research Laboratory of Precision Machinery and Electronics, Research Associate (90016783)
|
Project Period (FY) |
1989 – 1990
|
Keywords | Continuous speech recognition / Word speech recognition / Phoneme recognition / Phonemic segmentation / Large vocabulary / Word spotting / Speech data base / Dialogue level |
Research Abstract |
Through this research project, we demonstrated that a method combining phoneme-level segmentation, phoneme labeling, and context-independent word recognition is very effective for automatic continuous speech recognition. The main results are as follows. (1) We built a high-performance automatic phonemic segmentation system for speaker- and context-independent recognition of continuous Japanese speech. The segmentation algorithm is implemented as hierarchical segmentation with broad category classification, using selected segmentation parameters and acoustic-phonetic knowledge about continuous Japanese speech. It successfully segments continuous speech uttered at reading rate, as well as phonetically balanced word utterances covering various phonetic environments, into phonemic units. (2) We developed a high-performance speaker-dependent Japanese phoneme recognition system based on the phonemic segmentation and labeling. To evaluate the system, experiments were carried out with one female and one male speaker using 600 polysyllabic words from an unspecified vocabulary. The phoneme recognition accuracy was 84.0% for the female speaker and 81.6% for the male speaker. (3) We developed a context-independent spoken word recognition system. Word recognition proceeds in three steps: phonemic segmentation, construction of a phoneme lattice annotated with degrees of confidence, and word recognition. Word recognition is performed by matching the phoneme lattice of the unknown input speech against the phonemic symbol sequences in the word dictionary. The first-candidate word recognition rate was 99.0% for the female speaker and 95.3% for the male speaker. (4) We are now investigating a word spotting system. Word spotting is performed by continuous DP (dynamic programming) matching of the phoneme lattice of the unknown speech against the phonemic symbol sequences in the word dictionary. The method yields fairly good results.
|
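The lattice-to-dictionary matching described in results (3) and (4) can be illustrated with a minimal Python sketch. This is not the project's implementation: the lattice representation, the cost `1 - confidence`, the fixed insertion/deletion penalty, and all names (`match_cost`, `recognize`, `EXAMPLE_LATTICE`, `EXAMPLE_DICTIONARY`) are assumptions made for illustration. With `free_ends=True` the word may begin and end at any segment, which approximates the idea of continuous DP matching for word spotting.

```python
from typing import Dict, List, Tuple

# Assumed representation: one dict per segment, mapping each candidate
# phoneme to a confidence in [0, 1]. The report does not specify the format.
Lattice = List[Dict[str, float]]


def local_cost(segment: Dict[str, float], phoneme: str) -> float:
    """Cost of labeling a segment with a phoneme (assumed: 1 - confidence)."""
    return 1.0 - segment.get(phoneme, 0.0)


def match_cost(lattice: Lattice, phonemes: List[str],
               ins_del: float = 1.0, free_ends: bool = False) -> float:
    """Edit-distance style DP between lattice segments and a phoneme sequence.

    free_ends=True lets the match start and end at any segment, so the
    word can be located inside a longer utterance (word spotting).
    """
    n, m = len(lattice), len(phonemes)
    # d[i][j]: best cost of aligning the first i segments with the first j phonemes
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_del  # dictionary phoneme with no segment
    for i in range(1, n + 1):
        # Skipping leading segments is free only in spotting mode.
        d[i][0] = 0.0 if free_ends else d[i - 1][0] + ins_del
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + local_cost(lattice[i - 1], phonemes[j - 1]),
                d[i - 1][j] + ins_del,  # extra segment in the input
                d[i][j - 1] + ins_del,  # phoneme missing from the input
            )
    if free_ends:
        return min(d[i][m] for i in range(n + 1))  # word may end anywhere
    return d[n][m]


def recognize(lattice: Lattice,
              dictionary: Dict[str, List[str]]) -> List[Tuple[float, str]]:
    """Rank dictionary words by matching cost, best candidate first."""
    return sorted((match_cost(lattice, p), w) for w, p in dictionary.items())


# Hypothetical data, invented for illustration only.
EXAMPLE_LATTICE: Lattice = [
    {"s": 0.9, "z": 0.1}, {"a": 0.8}, {"k": 0.7, "t": 0.3},
    {"u": 0.9}, {"r": 0.6, "d": 0.4}, {"a": 0.9},
]
EXAMPLE_DICTIONARY = {
    "sakura": list("sakura"),
    "sakana": list("sakana"),
    "kuruma": list("kuruma"),
}
```

Calling `recognize(EXAMPLE_LATTICE, EXAMPLE_DICTIONARY)` ranks the three hypothetical words by cost, and `match_cost(longer_lattice, list("sakura"), free_ends=True)` scores the best occurrence of a word anywhere in a longer lattice. A practical system would also need penalties tuned per phoneme class and a threshold for accepting a spotted word, neither of which is described in the summary.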