Project/Area Number |
22500144
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Yamagata University |
Principal Investigator |
KOSAKA Tetsuo Yamagata University, Graduate School of Science and Engineering, Professor (50359569)
|
Co-Investigator(Renkei-kenkyūsha) |
KATO Masaharu Yamagata University, Graduate School of Science and Engineering, Assistant Professor (10250953)
|
Project Period (FY) |
2010 – 2012
|
Project Status |
Completed (Fiscal Year 2012)
|
Budget Amount |
¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2012: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2011: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2010: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
|
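Keywords | speech recognition / spontaneous speech / acoustic model / language model / speaker adaptation / spontaneous speech recognition / unsupervised speaker adaptation / word graph combination / cross-validation / speaker indexing / speaker vector / cross adaptation / context-dependent phone model / speaker-class acoustic model |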
Research Abstract |
This research aimed to improve the performance of systems for recognizing spontaneous speech, which is considered more difficult than recognizing read speech. We focused on three technical issues: (1) acoustic and language models, (2) system combination techniques, and (3) speaker indexing. To improve the acoustic models, we investigated a discrete-mixture hidden Markov model based on discriminative training, a speaker-class model, quinphone models, and a reverberation-class model. We also investigated several system combination techniques, including the combination of continuous and discrete models, the combination of various quinphones, and the combination of reverberation-class models. For language models, we proposed cross adaptation and cross-validation adaptation techniques. In addition, we improved the performance of the speaker-vector-based speaker indexing techniques required when performing speaker adaptation.
|