2013 Fiscal Year Final Research Report

Analysis and Generation of Speech Conversation by Considering Listener's Reaction

Research Project

Project/Area Number	22240013
Research Category	Grant-in-Aid for Scientific Research (A)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Kyoto University
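Principal Investigator	KAWAHARA Tatsuya, Kyoto University, Academic Center for Computing and Media Studies, Professor (00234104)
Co-Investigator(Kenkyū-buntansha)	SUMI Yasuyuki, Future University Hakodate, School of Systems Information Science, Professor (30362578); AKITA Yuya, Kyoto University, Academic Center for Computing and Media Studies, Assistant Professor (90402742); MORI Shinsuke, Kyoto University, Academic Center for Computing and Media Studies, Associate Professor (90456773)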
Project Period (FY)	2010-04-01 – 2014-03-31
Keywords	Recognition of images, text, speech, etc. / Content archives / Agents / Multimodal interfaces / Speech conversation
Research Abstract	This project investigates a novel approach to the analysis of speech communication and the design of conversational systems, focusing on the listener's reactions. In human interactions, listener reactions such as eye gaze, backchannels, and laughter are detected, and these behavioral signals are combined to predict the listener's interest and comprehension level. In addition, a new type of spoken dialogue system is developed that proactively presents information based on the user's interest and focus.

Research Products
(44 results)

Journal Article (14 results, of which Peer Reviewed: 14 results) / Presentation (30 results)

[Journal Article] Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language (2014)
- Author(s)
  M. Ablimit, T. Kawahara, and A. Hamdulla
- Journal Title
  
  Speech Communication
  
  Volume: Vol.60 Pages: 78-87
- DOI
  10.1016/j.specom.2013.09.011
- Peer Reviewed
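[Journal Article] Construction of a language model for spoken dialogue based on sentence selection via predicate-argument structures (in Japanese) (2014)
- Author(s)
  K. Yoshino, S. Mori, and T. Kawahara
- Journal Title
  
  Transactions of the Japanese Society for Artificial Intelligence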
  
  Volume: Vol.29, No.1 Pages: 53-59
- Peer Reviewed
[Journal Article] Automatic comma insertion for lecture transcripts based on multiple annotations (in Japanese) (2013)
- Author(s)
  Y. Akita and T. Kawahara
- Journal Title
  
  IPSJ Journal
  
  Volume: Vol.54, No.2 Pages: 463-470
- Peer Reviewed
[Journal Article] A monotonic statistical machine translation approach to speaking style transformation (2012)
- Author(s)
  G. Neubig, Y. Akita, S. Mori, and T. Kawahara
- Journal Title
  
  Computer Speech and Language
  
  Volume: Vol.26, No.5 Pages: 349-370
- DOI
  10.1016/j.csl.2012.02.003
- Peer Reviewed
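[Journal Article] Fast speaker normalization and adaptation based on BIC for meeting speech recognition (in Japanese) (2012)
- Author(s)
  M. Mimura and T. Kawahara
- Journal Title
  
  IEICE Transactions (Japanese Edition)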
  
  Volume: Vol.J95-D, No.7 Pages: 1467-1475
- Peer Reviewed
[Journal Article] Bayesian learning of a language model from continuous speech (2012)
- Author(s)
  G. Neubig, M. Mimura, S. Mori, and T. Kawahara
- Journal Title
  
  IEICE Trans
  
  Volume: Vol.E95-D, No.2 Pages: 614-625
- Peer Reviewed
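[Journal Article] Spoken dialogue system for information extraction and recommendation based on similarity of predicate-argument structures (in Japanese) (2011)
- Author(s)
  K. Yoshino, S. Mori, and T. Kawahara
- Journal Title
  
  IPSJ Journal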
  
  Volume: Vol.52, No.12 Pages: 3386-3397
- Peer Reviewed
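[Journal Article] Detection of acoustic events and hot spots based on audience reactions in speech conversation content (in Japanese) (2011)
- Author(s)
  T. Kawahara, K. Sumi, J. Ogata, and M. Goto
- Journal Title
  
  IPSJ Journal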
  
  Volume: Vol.52, No.12 Pages: 3363-3373
- Peer Reviewed
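[Journal Article] Semi-supervised training of acoustic models using statistical language model transformation (in Japanese) (2011)
- Author(s)
  M. Mimura, Y. Akita, and T. Kawahara
- Journal Title
  
  IEICE Transactions (Japanese Edition)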
  
  Volume: Vol.J94-D, No.2 Pages: 460-468
- Peer Reviewed
[Journal Article] Online unsupervised classification with model comparison in the Variational Bayes framework for voice activity detection (2010)
- Author(s)
  D. Cournapeau, S. Watanabe, A. Nakamura, and T. Kawahara
- Journal Title
  
  IEEE J. Selected Topics in Signal Processing
  
  Volume: Vol.4, No.6 Pages: 1071-1083
- DOI
  10.1109/JSTSP.2010.2080821
- Peer Reviewed
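[Journal Article] Speech recognition system for Diet deliberations to support the production of meeting records (in Japanese) (2010)
- Author(s)
  Y. Akita, M. Mimura, and T. Kawahara
- Journal Title
  
  IEICE Transactions (Japanese Edition)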
  
  Volume: Vol.J93-D, No.9 Pages: 1736-1744
- Peer Reviewed
[Journal Article] Robust speech recognition based on dereverberation parameter optimization using acoustic model likelihood (2010)
- Author(s)
  R. Gomez and T. Kawahara
- Journal Title
  
  IEEE Trans. Audio, Speech & Language Process
  
  Volume: Vol.18, No.7 Pages: 1708-1716
- DOI
  10.1109/TASL.2010.2052610
- Peer Reviewed
[Journal Article] Statistical transformation of language and pronunciation models for spontaneous speech recognition (2010)
- Author(s)
  Y. Akita and T. Kawahara
- Journal Title
  
  IEEE Trans. Audio, Speech & Language Process
  
  Volume: Vol.18, No.6 Pages: 1539-1549
- DOI
  10.1109/TASL.2009.2037400
- Peer Reviewed
[Journal Article] Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude (2010)
- Author(s)
  K. Ishizuka, S. Araki, and T. Kawahara
- Journal Title
  
  IEEE Trans. Audio, Speech & Language Process
  
  Volume: Vol.18, No.6 Pages: 1354-1365
- DOI
  10.1109/TASL.2009.2033955
- Peer Reviewed
[Presentation] Smart posterboard: Multi-modal sensing and analysis of poster conversations (2013)
- Author(s)
  T. Kawahara
- Organizer
  Proc. APSIPA ASC, (plenary overview talk)
- Place of Presentation
  Kaohsiung, Taiwan
- Year and Date
  2013-10
[Presentation] Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations (2013)
- Author(s)
  T. Kawahara, S. Hayashi, and K. Takanashi
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Lyon, France
- Year and Date
  2013-08
[Presentation] Incorporating semantic information to selection of web texts for language model of spoken dialogue system (2013)
- Author(s)
  K. Yoshino, S. Mori, and T. Kawahara
- Organizer
  Proc. IEEE-ICASSP
- Place of Presentation
  Vancouver, Canada
- Year and Date
  2013-05
[Presentation] Language modeling for spoken dialogue system based on filtering using predicate-argument structures (2012)
- Author(s)
  K. Yoshino, S. Mori, and T. Kawahara
- Organizer
  Proc. COLING
- Place of Presentation
  Mumbai, India
- Year and Date
  2012-12
[Presentation] Hybrid vector space model for flexible voice search (2012)
- Author(s)
  C. Lee and T. Kawahara
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Los Angeles, USA
- Year and Date
  2012-12
[Presentation] Language modeling for spoken dialogue system based on sentence transformation and filtering using predicate-argument structures (2012)
- Author(s)
  K. Yoshino, S. Mori, and T. Kawahara
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Los Angeles, USA
- Year and Date
  2012-12
[Presentation] Automatic transcription of lecture speech using language model based on speaking-style transformation of proceeding texts (2012)
- Author(s)
  Y. Akita, M. Watanabe, and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Portland, USA
- Year and Date
  2012-09
[Presentation] Dereverberation based on wavelet packet filtering for robust automatic speech recognition (2012)
- Author(s)
  R. Gomez and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Portland, USA
- Year and Date
  2012-09
[Presentation] Prediction of turn-taking by combining prosodic and eye-gaze information in poster conversations (2012)
- Author(s)
  T. Kawahara, T. Iwatate, and K. Takanashi
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Portland, USA
- Year and Date
  2012-09
[Presentation] Can we predict who in the audience will ask what kind of questions with their feedback behaviors in poster conversation? (2012)
- Author(s)
  T. Kawahara, T. Iwatate, T. Tsuchiya, and K. Takanashi
- Organizer
  Proc. Interdisciplinary Workshop on Feedback Behaviors in Dialog
- Place of Presentation
  Portland, USA
- Year and Date
  2012-09
[Presentation] Transcription system using automatic speech recognition for the Japanese Parliament (Diet) (2012)
- Author(s)
  T. Kawahara
- Organizer
  Proc. AAAI/IAAI
- Place of Presentation
  Toronto, Canada
- Year and Date
  2012-07
[Presentation] Multi-modal sensing and analysis of poster conversations toward smart posterboard (2012)
- Author(s)
  T. Kawahara
- Organizer
  Proc. SIGdial Meeting Discourse & Dialogue
- Place of Presentation
  Seoul, South Korea
- Year and Date
  2012-07
[Presentation] Discriminative approach to lexical entry selection for automatic speech recognition of agglutinative language (2012)
- Author(s)
  M. Ablimit, T. Kawahara, and A. Hamdulla
- Organizer
  Proc. IEEE-ICASSP
- Place of Presentation
  Kyoto, Japan
- Year and Date
  2012-03
[Presentation] Optimized wavelet-based speech enhancement for speech recognition in noisy and reverberant conditions (2011)
- Author(s)
  R. Gomez and T. Kawahara
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Xi'an, China
- Year and Date
  2011-10
[Presentation] Fast speaker normalization and adaptation based on BIC for meeting speech recognition (2011)
- Author(s)
  M. Mimura and T. Kawahara
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Xi'an, China
- Year and Date
  2011-10
[Presentation] Lexicon optimization for automatic speech recognition based on discriminative learning (2011)
- Author(s)
  M. Ablimit, T. Kawahara, and A. Hamdulla
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Xi'an, China
- Year and Date
  2011-10
[Presentation] Info-concierge: Proactive multi-modal interaction through mind probing (2011)
- Author(s)
  T. Hirayama, Y. Sumi, T. Kawahara, and T. Matsuyama
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Xi'an, China
- Year and Date
  2011-10
[Presentation] Combining slot-based vector space model for voice book search (2011)
- Author(s)
  C. Lee, T. Kawahara, and A. Rudnicky
- Organizer
  Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS)
- Place of Presentation
  Granada, Spain
- Year and Date
  2011-09
[Presentation] Automatic comma insertion of lecture transcripts based on multiple annotations (2011)
- Author(s)
  Y. Akita and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Florence, Italy
- Year and Date
  2011-08
[Presentation] Denoising using optimized wavelet filtering for automatic speech recognition (2011)
- Author(s)
  R. Gomez and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Florence, Italy
- Year and Date
  2011-08
[Presentation] Spoken dialogue system based on information extraction using similarity of predicate argument structures (2011)
- Author(s)
  K. Yoshino, S. Mori, and T. Kawahara
- Organizer
  Proc. SIGdial Meeting Discourse & Dialogue
- Place of Presentation
  Portland, USA
- Year and Date
  2011-06
[Presentation] Optimizing wavelet parameters for dereverberation in automatic speech recognition (2010)
- Author(s)
  R. Gomez and T. Kawahara
- Organizer
  Proc. APSIPA ASC
- Place of Presentation
  Singapore
- Year and Date
  2010-12
[Presentation] Automatic transcription of parliamentary meetings and classroom lectures - a sustainable approach and real system evaluations - (2010)
- Author(s)
  T. Kawahara
- Organizer
  Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP)
- Place of Presentation
  Tainan, Taiwan
- Year and Date
  2010-12
[Presentation] Spoken dialogue system based on information extraction from web text (2010)
- Author(s)
  K. Yoshino and T. Kawahara
- Organizer
  Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS)
- Place of Presentation
  Gotemba, Japan
- Year and Date
  2010-09
[Presentation] Detection of hot spots in poster conversations based on reactive tokens of audience (2010)
- Author(s)
  T. Kawahara, K. Sumi, Z.Q. Chang, and K. Takanashi
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09
[Presentation] Learning a language model from continuous speech (2010)
- Author(s)
  G. Neubig, M. Mimura, S. Mori, and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09
[Presentation] Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures (2010)
- Author(s)
  T. Kawahara, N. Katsumaru, Y. Akita, and S. Mori
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09
[Presentation] An improved wavelet-based dereverberation for robust automatic speech recognition (2010)
- Author(s)
  R. Gomez and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09
[Presentation] Semi-automated update of automatic transcription system for the Japanese national congress (2010)
- Author(s)
  Y. Akita, M. Mimura, G. Neubig, and T. Kawahara
- Organizer
  Proc. INTERSPEECH
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09
[Presentation] Analysis on prosodic features of Japanese reactive tokens in poster conversations (2010)
- Author(s)
  T. Kawahara, Z.Q. Chang, and K. Takanashi
- Organizer
  Proc. Int'l Conf. Speech Prosody
- Place of Presentation
  Chicago, USA
- Year and Date
  2010-05
