Investigation of method optimization for multi-modal speech recognition
Project/Area Number | 25730109
Research Category | Grant-in-Aid for Young Scientists (B)
Allocation Type | Multi-year Fund
Research Field | Perceptual information processing
Research Institution | Gifu University
Principal Investigator |
Project Period (FY) | 2013-04-01 – 2016-03-31
Project Status | Completed (Fiscal Year 2015)
Budget Amount | ¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2015: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2014: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2013: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Keywords | Speech recognition / Multi-modal information processing / Lip reading / Optimization / Real environments
Outline of Final Research Achievements |
For multi-modal speech recognition that uses both speech signals and lip images, this research aimed to develop methods for optimizing the recognition system according to task and environment. The work clarified the effectiveness of incorporating several basic features and of applying deep-learning techniques, identified the optimal architecture for audio-visual integration together with the effectiveness of stochastic model combination, and improved model adaptation. A robust, high-performance multi-modal speech recognition method was thereby developed. Applying the method to various tasks and environments yielded improved recognition accuracy and also revealed directions for future work.
Report (4 results)
Research Products (16 results)