2005 Fiscal Year Final Research Report Summary
Speech recognition accepting utterances including out-of-vocabularies
Project/Area Number | 14380168 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Waseda University |
Principal Investigator |
SAGISAKA Yoshinori Waseda University, Graduate School of Global Information and Telecommunication Studies, Professor (70339737)
|
Co-Investigator(Kenkyū-buntansha) |
SHIRAI Katsuhiko Waseda University, School of Science and Engineering, Professor (10063702)
KOBAYASI Tsunori Waseda University, School of Science and Engineering, Professor (30162001)
YAWIMOTO Hirofumi Advanced Telecommunications Research Institute International, Senior Researcher (00395013)
|
Project Period (FY) | 2002 – 2005 |
Keywords | statistical language model / out of vocabulary (OOV) / hierarchical language model / continuous speech recognition / task-free speech recognition |
Research Abstract |
A speech recognition scheme was studied that accepts utterances containing out-of-vocabulary words (OOVs). A hierarchical statistical language model was newly proposed to cope with OOVs, and speech recognition experiments were carried out to confirm its effectiveness. This language model describes two properties of unregistered expressions statistically and independently of each other: their word-neighboring characteristics and their constituent phonotactic constraints. The upper layer of the hierarchical model consists of inter-word statistics expressed by multi-dimensional composite word N-grams, and the lower layer expresses intra-word statistical phonotactics using multi-dimensional composite sub-word units. A series of speech recognition experiments showed that this language modeling enables effective use of the independent statistics and achieves high recognition performance for utterances including OOVs. By expanding the lower-layer model from single words, such as personal names and city names, to much longer named entities, such as book titles and movie titles, we showed that the modeling is also valid for other unregistered expressions consisting of multiple words. This success suggests that the proposed language model handles OOVs effectively regardless of task, and that a task-free statistical language model may be possible by integrating different statistical constraints independently. In the speech recognition experiments, long unregistered expressions for movie titles were expressed by multi-dimensional composite word N-grams as a lower-layer model. Experimental results showed that the recognition accuracy of the proposed model almost matched the theoretical upper limit obtained by registering all OOVs in the recognition lexicon. Furthermore, multiple Markov models were obtained automatically by splitting OOV characteristics into multiple lower-layer models. Both word-class intrinsic models and automatically derived unsupervised models proved useful for general unspecified OOVs, which gives a guideline for building statistical language models according to the size and quality of the available language data.
|
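The two-layer idea in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: plain add-one-smoothed bigrams stand in for the multi-dimensional composite N-grams, single characters stand in for sub-word units, and every name here (`BigramModel`, `HierarchicalLM`, the `<OOV>` token, the toy data) is hypothetical. The upper layer scores the word sequence with each unregistered word replaced by a class token; the lower layer scores the internal make-up of each such word independently.

```python
import math
from collections import Counter

# Hypothetical special tokens; none of these come from the original report.
OOV, UNK, BOS, EOS = "<OOV>", "<unk>", "<s>", "</s>"


class BigramModel:
    """Add-one-smoothed bigram model over token sequences (a stand-in only)."""

    def __init__(self, sequences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = {EOS, UNK}  # tokens the model can predict
        for seq in sequences:
            padded = [BOS] + list(seq) + [EOS]
            self.vocab.update(padded[1:])
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def _norm(self, token):
        # Map tokens never seen in training to the unknown symbol.
        return token if token == BOS or token in self.vocab else UNK

    def sequence_logprob(self, seq):
        padded = [BOS] + [self._norm(t) for t in seq] + [EOS]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigram_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + len(self.vocab)
            total += math.log(num / den)
        return total


class HierarchicalLM:
    """Upper layer: word bigrams with an OOV class. Lower layer: character bigrams."""

    def __init__(self, sentences, lexicon, oov_examples):
        self.lexicon = set(lexicon)
        self.upper = BigramModel([self._to_classes(s) for s in sentences])
        self.lower = BigramModel(oov_examples)  # each string is a character sequence

    def _to_classes(self, sentence):
        return [w if w in self.lexicon else OOV for w in sentence]

    def logprob(self, sentence):
        # Inter-word statistics, with unregistered words collapsed to one class.
        total = self.upper.sequence_logprob(self._to_classes(sentence))
        # Intra-word statistics, added independently for each unregistered word.
        for word in sentence:
            if word not in self.lexicon:
                total += self.lower.sequence_logprob(word)
        return total


# Toy data, invented for illustration.
lexicon = {"i", "want", "to", "call", "meet"}
sentences = [["i", "want", "to", "call", "tanaka"],
             ["i", "want", "to", "meet", "suzuki"]]
names = ["tanaka", "suzuki", "nakata", "tanabe"]

lm = HierarchicalLM(sentences, lexicon, names)
plausible = lm.logprob(["i", "want", "to", "call", "takana"])
implausible = lm.logprob(["i", "want", "to", "call", "xqzvk"])
print(plausible > implausible)
```

Because the two layers are trained and scored separately, the lower-layer training data can be swapped (for example, from personal names to titles) without retraining the upper layer, which is the independence property the abstract emphasizes.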
Research Products
(20 results)