2015 Fiscal Year Final Research Report

A study on custom-made augmented speech communication system based on higher-order statistics pursuit

Project/Area Number	23240023
Research Category	Grant-in-Aid for Scientific Research (A)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
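Research Institution	The University of Tokyo (2014-2015); Nara Institute of Science and Technology (2011-2013)
Principal Investigator	Saruwatari Hiroshi, The University of Tokyo, Graduate School of Information Science and Technology, Professor (30324974)
Co-Investigator(Kenkyū-buntansha)	SHIKANO Kiyohiro, Nara Institute of Science and Technology, Graduate School of Information Science, Professor Emeritus (00263426); TODA Tomoki, Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (90403328); KAWANAMI Hiromichi, Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80335489); ONO Nobutaka, National Institute of Informatics, Principles of Informatics Research Division, Associate Professor (80334259); MIYABE Shigeki, University of Tsukuba, Graduate School of Systems and Information Engineering, Assistant Professor (50598745); MAKINO Shoji, University of Tsukuba, Graduate School of Systems and Information Engineering, Professor (60396190); KOYAMA Shoichi, The University of Tokyo, Graduate School of Information Science and Technology, Assistant Professor (80734459)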
Project Period (FY)	2011-04-01 – 2016-03-31
Keywords	Speech information processing / Statistical learning theory / Acoustic signal processing
Outline of Final Research Achievements	In this study, we address an unsupervised, custom-made augmented speech communication system based on higher-order statistics pursuit. The system consists of two parts: a binaural hearing aid using blind source separation, and a speaking aid based on speech conversion. The following results were obtained. (1) For the binaural hearing-aid system, we propose new algorithms for accurate and fast blind source separation and statistical speech conversion, yielding a high-quality speech enhancement system that utilizes a fixed point of auditory perception. (2) For the speaking-aid system, we propose a new speech conversion algorithm that is robust against mismatches between speech databases. An evaluation using a real-world sound database shows the efficacy of the proposed method.
Free Research Field	Intelligent information processing