2016 Fiscal Year Final Research Report

A Unified Bayesian Approach to Simultaneous Speech Recognition for Mixture Signals

Research Project

Project/Area Number	15K12063
Research Category	Grant-in-Aid for Challenging Exploratory Research
Allocation Type	Multi-year Fund
Research Field	Perceptual information processing
Research Institution	Kyoto University
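Principal Investigator	Yoshii Kazuyoshi Kyoto University, Graduate School of Informatics, Senior Lecturer (20510001)
Co-Investigator (Renkei-kenkyūsha)	KAWAHARA Tatsuya Kyoto University, Graduate School of Informatics, Professor (00234104); MOCHIHASHI Daichi The Institute of Statistical Mathematics, Department of Statistical Modeling, Associate Professor (80418508)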
Project Period (FY)	2015-04-01 – 2017-03-31
Keywords	source separation / speech recognition / probabilistic model / Bayesian model / MCMC
Outline of Final Research Achievements	We proposed a method that recognizes multiple simultaneous utterances by using a probabilistic model of source separation. Because the source signals are uncertain, we combined speech recognition with source separation by considering the posterior distribution of the source signals. This enabled us to obtain recognition results directly from mixture signals without committing to a single estimate of the source signals. In addition, we proposed a source separation method based on an integrated model comprising a source model and a superimposition model. Each model is represented as either a mixture model (LDA) or a factor model (NMF), and the performance of each combination was evaluated.
Free Research Field	Statistical acoustic signal processing