Improvement of Very Large Vocabulary Speech Recognition using an encoding based on probabilistic structure of vocabulary
Project/Area Number |
20500166
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Hosei University |
Principal Investigator |
ITOU Katunobu Hosei University, Faculty of Computer and Information Sciences, Professor (30356472)
|
Project Period (FY) |
2008 – 2010
|
Project Status |
Completed (Fiscal Year 2010)
|
Budget Amount |
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2010: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2009: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2008: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
|
Keywords | speech recognition / speaker recognition / acoustic lifelog / speaker identification / lifelog / speech enhancement / speech interface |
Research Abstract |
For speech recognition in a large-vocabulary task, no phone sequence caused a statistically significant degradation in performance when language models did not contribute. For speaker recognition and verification, on the other hand, improving performance appears to be difficult by any means other than increasing the amount of training data. In most applications of speaker recognition, sufficient training data cannot be expected. Moreover, it is difficult to assume a target phone sequence in advance. A new method is therefore required for speaker recognition, because many of the methods previously used to improve speech recognition are not effective for speaker recognition.
|
Report
(4 results)
Research Products
(27 results)