
2014 Fiscal Year Final Research Report

Analysis and synthesis method of phonetic/emotional information in audio-visual speech information

Research Project

Project/Area Number 24650100
Research Category

Grant-in-Aid for Challenging Exploratory Research

Allocation Type Multi-year Fund
Research Field Sensitivity informatics/Soft computing
Research Institution Tohoku University

Principal Investigator

SUZUKI Yo-iti  Tohoku University, Research Institute of Electrical Communication, Professor (20143034)

Co-Investigator(Kenkyū-buntansha) KAWASE Tetsuaki  Tohoku University, Graduate School of Biomedical Engineering, Professor (50169728)
SAKAMOTO Shuichi  Tohoku University, Research Institute of Electrical Communication, Associate Professor (60332524)
Project Period (FY) 2012-04-01 – 2015-03-31
Keywords Audio-visual speech perception / Multimodal interface / Kansei (affective) information processing
Outline of Final Research Achievements

Moving images of a talker's face carry much information useful for speech understanding; interpreting that information is known as lip-reading. Such information should be carefully considered in the development of advanced multimodal communication systems. Toward developing such systems, we focused on the relationship between speech sound information and moving images of the talker's face. In this study, we examined in particular which parts of the moving image of the talker's face contribute most to speech understanding. We conducted audio-visual speech intelligibility tests and investigated how intelligibility depends on which parts of the facial image are visible. The results indicated that the mouth area alone provides sufficient information for speech intelligibility, suggesting that lip-reading cues around the mouth might be generated from the speech sound information.
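As an illustration of the kind of measure used in such intelligibility tests, the sketch below scores listener responses as the proportion of correctly identified items per presentation condition. This is a minimal, hypothetical example: the condition names, stimuli, and response data are invented for illustration and are not taken from the report.

```python
def intelligibility(responses, targets):
    """Proportion of responses that match the target items (0.0 to 1.0)."""
    assert len(responses) == len(targets)
    correct = sum(r == t for r, t in zip(responses, targets))
    return correct / len(targets)

# Hypothetical target syllables and listener responses under three
# presentation conditions (names are assumptions for illustration only).
targets = ["ka", "sa", "ta", "na", "ha"]
trials = {
    "audio_only":       ["ka", "sa", "ka", "na", "a"],   # degraded audio alone
    "audio_plus_mouth": ["ka", "sa", "ta", "na", "ha"],  # audio + mouth-area video
    "audio_plus_face":  ["ka", "sa", "ta", "na", "ha"],  # audio + full-face video
}

scores = {cond: intelligibility(resp, targets) for cond, resp in trials.items()}
for cond, score in scores.items():
    print(f"{cond}: {score:.2f}")
```

Comparing such scores across conditions (here, mouth-only video matching full-face video) is one way the contribution of each facial region to intelligibility can be quantified.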

Free Research Field

Informatics


Published: 2016-06-03  
