2010 Fiscal Year Final Research Report
A Method for Speech Analysis Based on a Visual-to-Auditory Feedback Mechanism
Project/Area Number | 21680016
Research Category | Grant-in-Aid for Young Scientists (A)
Allocation Type | Single-year Grants
Research Field | Perception information processing / Intelligent robotics
Research Institution | Kyoto University
Principal Investigator | KAWASHIMA Hiroaki, Kyoto University, Graduate School of Informatics, Lecturer (40346101)
Project Period (FY) | 2009 – 2010
Keywords | Speech estimation and separation / Lip movement / Linear systems / Hybrid systems / Timing structure / Audio-visual integration / Multimodality / Time-series segmentation
Research Abstract |
We developed a novel speech-analysis method based on detailed modeling of the temporal relationship between mouth movements and speech signals. First, we use a hybrid system, an integrated model of dynamical systems and discrete-event systems, as a mathematical tool to segment and model multimedia signals such as captured mouth motion and speech data. We then build a statistical cross-media timing model that can be learned from the segmented data. The proposed method realizes the signal-generation mechanism "from mouth motion to speech," which enables highly accurate speech estimation in non-stationary noise environments.
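The core idea in the abstract, segmenting a time series by switching among simple dynamical-system modes, can be illustrated with a minimal sketch. This is not the report's actual model (which handles multivariate mouth-motion and speech features and learns a cross-media timing structure); it is a hypothetical toy example assuming scalar first-order linear dynamics x[t+1] = a·x[t] and a known set of candidate modes, with each sample assigned to the mode that best predicts it and consecutive runs collapsed into segments.

```python
import numpy as np

def one_step_error(x, a):
    """Squared one-step prediction error of the linear system x[t+1] = a * x[t]."""
    return (x[1:] - a * x[:-1]) ** 2

def segment_by_modes(x, modes):
    """Assign each transition to the best-fitting mode, then collapse runs
    of identical mode labels into (start, end, mode) segments."""
    errs = np.stack([one_step_error(x, a) for a in modes])  # shape (M, T-1)
    labels = np.argmin(errs, axis=0)                        # mode index per step
    segments, start = [], 0
    for t in range(1, len(labels)):
        if labels[t] != labels[t - 1]:                      # discrete-event switch
            segments.append((start, t, modes[labels[start]]))
            start = t
    segments.append((start, len(labels), modes[labels[start]]))
    return segments

# Synthetic signal: decay with a = 0.9 for 30 samples, then a = 0.5 afterwards.
x = [1.0]
for _ in range(29):
    x.append(0.9 * x[-1])
for _ in range(30):
    x.append(0.5 * x[-1])
x = np.array(x)

segs = segment_by_modes(x, [0.9, 0.5])
print(segs)  # two segments, one per dynamical regime
```

In the report's framework this mode-labeling step would be replaced by learned linear dynamical systems inside a hybrid system, and the resulting segment boundaries (for both lip motion and audio) feed the statistical timing model that links the two media.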