2010 Fiscal Year Final Research Report
A study of multimodal recognition for human communication search
Project/Area Number |
20300063
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
SHINODA Koichi Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Associate Professor (10343097)
|
Co-Investigator(Kenkyū-buntansha) |
FURUI Sadaoki Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Professor (90293076)
|
Project Period (FY) |
2008 – 2010
|
Keywords | speech recognition / video recognition / multimodal recognition / human communication understanding / information retrieval |
Research Abstract |
We developed multimodal pattern recognition techniques for human communication using speech and video. For event extraction, we proposed a statistical technique based on Gaussian mixture models and support vector machines. We participated in the TRECVID 2010 workshop, where our method achieved the fourth-best performance among 40 participants from around the world. We also developed new methods for active learning for speech modeling and adaptation, noise-robust speech recognition, signal processing for meeting speech recognition, multimodal pattern recognition, speaker and gesture recognition, speech style analysis, and video summarization.
|
[Presentation]2010
Author(s)
Nakamasa Inoue, Toshiya Wada, Yusuke Kamishima, Koichi Shinoda, Ilseo Kim, Byungki Byun, Chin-Hui Lee
Organizer
TT+GT at TRECVID 2010 Workshop, TRECVID 2010 workshop
Place of Presentation
Gaithersburg
Year and Date
2010-11-15