2015 Fiscal Year Final Research Report

A Study on Robust Speaker Diarization to Various Speaking Styles for Multi-party Conversations

Research Project

Project/Area Number	25330210
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Perceptual information processing
Research Institution	Shizuoka University (2015) Nagoya University (2014) Doshisha University (2013)
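Principal Investigator	Nishida Masafumi Shizuoka University, Faculty of Informatics, Associate Professor (80361442)
Co-Investigator(Kenkyū-buntansha)	YAMAMOTO SEIICHI Doshisha University, Faculty of Science and Engineering, Professor (20374100)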
Project Period (FY)	2013-04-01 – 2016-03-31
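Keywords	multi-party conversation / speaker diarization / speaking style / phonemic characteristics / speaker characteristics / intra-speaker variance / inter-speaker variance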
Outline of Final Research Achievements	We proposed a speaker clustering method that applies Gaussian mixture models in a flexibly selected speaker subspace, where the subspace is chosen on the basis of intra-utterance variance, in order to make speaker clustering robust to various speaking styles. We carried out speaker clustering experiments comparing the proposed method with conventional methods based on the Bayesian information criterion and on Gaussian mixture models in the observation space. The experimental results showed that the proposed method achieves higher clustering accuracy than the conventional methods.
Free Research Field	Speech information processing