A Study on Robust Speaker Diarization to Various Speaking Styles for Multi-party Conversations

Research Project

Project/Area Number	25330210
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Perceptual information processing
Research Institution	Shizuoka University (2015) Nagoya University (2014) Doshisha University (2013)
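Principal Investigator	Nishida Masafumi Shizuoka University, Faculty of Informatics, Associate Professor (80361442)
Co-Investigator (Kenkyū-buntansha)	YAMAMOTO SEIICHI Doshisha University, Faculty of Science and Engineering, Professor (20374100)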
Project Period (FY)	2013-04-01 – 2016-03-31
Project Status	Completed (Fiscal Year 2015)
Budget Amount	¥2,860,000 (Direct Cost: ¥2,200,000, Indirect Cost: ¥660,000); Fiscal Year 2015: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000); Fiscal Year 2014: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2013: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
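Keywords	multi-party conversation / speaker diarization / speaking style / phonemic characteristics / speaker characteristics / within-speaker variance / between-speaker variance / speaker clustering / phonemic and speaker characteristics / principal component analysis / GMM / lecture speech / speaker recognition / topic content / utterance behavior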
Outline of Final Research Achievements	To make speaker clustering robust to various speaking styles, we proposed a speaker clustering method that uses Gaussian mixture models in a speaker subspace selected flexibly according to intra-utterance variance. We ran speaker clustering experiments comparing the proposed method with conventional methods based on the Bayesian information criterion and on Gaussian mixture models in the observation space. The results showed that the proposed method achieves higher clustering accuracy than the conventional methods.
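
As a rough illustration of the idea of choosing a feature subspace from variance statistics, the sketch below ranks feature dimensions by the ratio of between-utterance variance to within-utterance variance and keeps the top ones. This is not the project's method: the record does not give the algorithm, so the ratio criterion, the function names `select_dims` and `project`, and the toy data are all assumptions made for illustration. The Gaussian mixture modelling step is omitted.

```python
from statistics import fmean, pvariance


def select_dims(utterances, keep):
    """Rank feature dimensions by between-/within-utterance variance ratio.

    utterances: list of utterances; each utterance is a list of frames,
    and each frame is a list of floats of equal length.
    Returns the indices of the `keep` highest-scoring dimensions, sorted.
    (Hypothetical helper; the criterion is an assumption, not the source's.)
    """
    n_dims = len(utterances[0][0])
    scored = []
    for d in range(n_dims):
        # Average variance of dimension d inside each utterance.
        within = fmean(pvariance([f[d] for f in utt]) for utt in utterances)
        # Variance of the per-utterance means of dimension d.
        between = pvariance([fmean(f[d] for f in utt) for utt in utterances])
        # Small constant guards against division by zero.
        scored.append((between / (within + 1e-12), d))
    top = sorted(scored, reverse=True)[:keep]
    return sorted(d for _, d in top)


def project(frame, dims):
    """Keep only the selected dimensions of one frame."""
    return [frame[d] for d in dims]


# Toy data invented for illustration: dimension 0 separates the two
# utterances, while dimensions 1 and 2 vary only inside each utterance.
utt_a = [[1.0, 0.0, 0.5], [1.1, 5.0, 0.4], [0.9, -5.0, 0.6]]
utt_b = [[3.0, 5.0, 0.5], [3.1, -5.0, 0.6], [2.9, 0.0, 0.4]]
dims = select_dims([utt_a, utt_b], keep=1)
reduced = [project(f, dims) for f in utt_a]
```

In a fuller pipeline, the reduced frames would then be modelled per cluster, for example with Gaussian mixture models, which is the part this sketch leaves out.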

Report

(4 results)

2015 Annual Research Report / Final Research Report (PDF)
2014 Research-status Report
2013 Research-status Report

Research Products
(24 results)


Journal Article (4 results) (of which Peer Reviewed: 4 results, Open Access: 3 results, Acknowledgement Compliant: 1 result) Presentation (20 results) (of which Int'l Joint Research: 4 results, Invited: 1 result)

[Journal Article] Speech Recognition of English by Japanese using Lexicon Represented by Multiple Reduced Phoneme Sets (2015)
- Author(s)
  X. Wang, S. Yamamoto
- Journal Title
  Trans. IEICE
  Volume: E98-D Issue: 12 Pages: 2271-2279
- NAID
  130005112316
- Related Report
  2015 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Multimodal corpus of multiparty conversations in L1 and L2 languages and findings obtained from it (2015)
- Author(s)
  Yamamoto, S., Taguchi, K., Ijuin, K., Umata, I., and Nishida, M.
- Journal Title
  Language Resources and Evaluation
  Volume: 49 Issue: 4 Pages: 857-882
- DOI
  10.1007/s10579-015-9299-2
- Related Report
  2014 Research-status Report
- Peer Reviewed / Open Access / Acknowledgement Compliant
[Journal Article] Automatic Induction of Romanization Systems from Bilingual Corpora (2015)
- Author(s)
  K. Taguchi, A. Finch, S. Yamamoto, E. Sumita
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E98.D Issue: 2 Pages: 381-393
- DOI
  10.1587/transinf.2014EDP7236
- NAID
  130004841828
- ISSN
  0916-8532, 1745-1361
- Related Report
  2014 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Gaze and Turn-Taking Behavior in Casual Conversational Interactions (2013)
- Author(s)
  K. Jokinen, H. Furukawa, M. Nishida, S. Yamamoto
- Journal Title
  ACM Transactions on Interactive Intelligent Systems
  Volume: 3 Issue: 2 Pages: 1-30
- DOI
  10.1145/2499474.2499481
- Related Report
  2013 Research-status Report
- Peer Reviewed
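[Presentation] A Study on Robust Speech Recognition Based on Multi-source Sound Information (2016)
- Author(s)
  林升柯，西田昌史，西村雅史
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa)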
- Year and Date
  2016-03-09
- Related Report
  2015 Annual Research Report
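[Presentation] A Study on a Simple Non-invasive System for Recognizing Physical Condition (2016)
- Author(s)
  安藤純平，西田昌史，西村雅史
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa)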
- Year and Date
  2016-03-09
- Related Report
  2015 Annual Research Report
[Presentation] Daily Activity Recognition Based on Acoustic Signals and Acceleration Signals Estimated with Gaussian Process (2015)
- Author(s)
  M. Nishida, N. Kitaoka, K. Takeda
- Organizer
  APSIPA
- Place of Presentation
  Hong Kong (China)
- Year and Date
  2015-12-16
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Speech Segment Estimation in Multi-party Conversations Using Throat Microphones (2015)
- Author(s)
  大高祥裕，西田昌史，西村雅史
- Organizer
  13th Workshop on Informatics
- Place of Presentation
  Meijo University (Nagoya, Aichi)
- Year and Date
  2015-12-05
- Related Report
  2015 Annual Research Report
[Presentation] Quantitative Analyses of Gaze Activity during Silence: Comparison between Native-language and Second-language Conversations (2015)
- Author(s)
  I. Umata, T. Tanizoe, K. Ijuin, S. Yamamoto
- Organizer
  EAP Cogsci
- Place of Presentation
  Torino (Italy)
- Year and Date
  2015-09-25
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Eye Gaze Analyses in L1 and L2 Conversations: Difference in Interaction Structure (2015)
- Author(s)
  K. Ijuin, Y. Horiuchi, I. Umata, S. Yamamoto
- Organizer
  TSD
- Place of Presentation
  Plzen (Czech)
- Year and Date
  2015-09-14
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Daily Activity Recognition Based on DNN Using Environmental Sound and Acceleration Signals (2015)
- Author(s)
  T. Hayashi, M. Nishida, N. Kitaoka, K. Takeda
- Organizer
  EUSIPCO
- Place of Presentation
  Nice (France)
- Year and Date
  2015-08-31
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
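[Presentation] Speaker Recognition for Second-language Speech by Native Japanese Speakers (2015)
- Author(s)
  阿部将和，西田昌史，山本誠一
- Organizer
  Acoustical Society of Japan 2015 Spring Meeting
- Place of Presentation
  Chuo University (Bunkyo, Tokyo)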
- Year and Date
  2015-03-16 – 2015-03-18
- Related Report
  2014 Research-status Report
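[Presentation] Analysis of Listeners' Gaze Behavior in Small-group Conversations in a Second Language (2015)
- Author(s)
  伊集院幸輝，田口惠子，馬田一郎，山本誠一
- Organizer
  IEICE General Conference
- Place of Presentation
  Ritsumeikan University (Kusatsu, Shiga)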
- Year and Date
  2015-03-10 – 2015-03-13
- Related Report
  2014 Research-status Report
[Presentation] Automatic Estimation of Gaze toward Interlocutors in Multi-party Conversations (2015)
- Author(s)
  荒木智彰，山本誠一
- Organizer
  IEICE General Conference
- Place of Presentation
  Ritsumeikan University (Kusatsu, Shiga)
- Year and Date
  2015-03-10 – 2015-03-13
- Related Report
  2014 Research-status Report
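[Presentation] Analysis of Differences in Gaze Behavior between Native and Second Languages According to Second-language Proficiency (2015)
- Author(s)
  堀内保大，伊集院幸輝，田口惠子，山本誠一
- Organizer
  IEICE General Conference
- Place of Presentation
  Ritsumeikan University (Kusatsu, Shiga)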
- Year and Date
  2015-03-10 – 2015-03-13
- Related Report
  2014 Research-status Report
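[Presentation] A Study of a Speaker Recognition Method Considering Speaking Style in Lecture Speech (2014)
- Author(s)
  中辻康太，西田昌史，山本誠一
- Organizer
  16th Spoken Language Symposium
- Place of Presentation
  Tokyo Institute of Technology (Yokohama, Kanagawa)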
- Year and Date
  2014-12-15 – 2014-12-16
- Related Report
  2014 Research-status Report
[Presentation] Eye Gaze Analyses in L1 and L2 Conversations: From the Perspective of Listeners’ Eye Gaze Activity (2014)
- Author(s)
  K. Ijuin, K. Taguchi, I. Umata, S. Yamamoto
- Organizer
  UMMMI-ICMI
- Place of Presentation
  Istanbul (Turkey)
- Year and Date
  2014-11-16
- Related Report
  2014 Research-status Report
[Presentation] Multimodal Japanese Corpus of Multi-party Conversation on Two Different Topic Types (2014)
- Author(s)
  K. Taguchi, K. Ijuin, I. Umata, S. Yamamoto
- Organizer
  Oriental COCOSDA
- Place of Presentation
  Phuket (Thailand)
- Year and Date
  2014-09-10 – 2014-09-12
- Related Report
  2014 Research-status Report
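[Presentation] Analysis of Differences in Gaze Behavior during Silence between Native and Second Languages Using a Multimodal Corpus (2014)
- Author(s)
  馬田一郎，田口惠子，伊集院幸輝，山本誠一
- Organizer
  13th Forum on Information Technology
- Place of Presentation
  University of Tsukuba (Tsukuba, Ibaraki)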
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Research-status Report
[Presentation] Comparative Analysis of Gaze Targets in Small-group Conversations (2014)
- Author(s)
  伊集院幸輝，田口惠子，山本誠一，馬田一郎
- Organizer
  13th Forum on Information Technology
- Place of Presentation
  University of Tsukuba (Tsukuba, Ibaraki)
- Year and Date
  2014-09-03 – 2014-09-05
- Related Report
  2014 Research-status Report
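[Presentation] Analysis of the Effect of Topic Content on Utterance Behavior in Small-group Conversations in a Second Language (2014)
- Author(s)
  野本沙斗子
- Organizer
  IEICE General Conference
- Place of Presentation
  Niigata University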
- Related Report
  2013 Research-status Report
[Presentation] Gaze and Turn-Taking Behavior in Casual Conversational Interactions (2014)
- Author(s)
  Masafumi Nishida
- Organizer
  International Conference on Intelligent User Interfaces
- Place of Presentation
  Haifa, Israel
- Related Report
  2013 Research-status Report
- Invited
[Presentation] Effects of Language Proficiency on Eye-gaze in Second Language Conversations: Toward Supporting Second Language Collaboration (2013)
- Author(s)
  Ichiro Umata
- Organizer
  International Conference on Multimodal Interaction
- Place of Presentation
  Sydney, Australia
- Related Report
  2013 Research-status Report
[Presentation] Differences in Interactional Attitudes in Native and Second Language Conversations: Quantitative Analyses of Multimodal Three-Party Corpus (2013)
- Author(s)
  Keiko Taguchi
- Organizer
  Annual Meeting of the Cognitive Science Society
- Place of Presentation
  Berlin, Germany
- Related Report
  2013 Research-status Report
