A spoken dialogue system based on estimation of mental state using multimodal processing

Research Project

Project/Area Number	15J07337
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Domestic
Research Field	Perceptual information processing
Research Institution	Kyoto University
Principal Investigator	井上昂治 (Koji Inoue), Kyoto University, Graduate School of Informatics, Research Fellow (DC1)
Project Period (FY)	2015-04-24 – 2018-03-31
Project Status	Completed (Fiscal Year 2017)
Budget Amount	¥2,800,000 (Direct Cost: ¥2,800,000); Fiscal Year 2017: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2016: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2015: ¥1,000,000 (Direct Cost: ¥1,000,000)
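Keywords	Spoken dialogue system / Multimodal / Mental state / Engagement / Dialogue system
Outline of Annual Research Achievements	This project addressed automatic estimation of the user's mental state in human-robot dialogue, namely the interest and willingness that the human user has toward the dialogue. Dialogue engagement was used as the indicator of interest and willingness. Automatic estimation of dialogue engagement is essential for realizing a dialogue system that adapts flexibly to the user's state and attitude, rather than one in which the system talks one-sidedly. First, ground-truth data were created by having multiple raters judge the degree of engagement in dialogue data. Next, as the engagement estimation model, a statistical model was proposed that takes into account a rater trait called character. The input consists of the user's listener behaviors, specifically backchannels, laughing, nodding, and eye gaze. Because judgments of engagement are subjective, the results can differ between raters. Introducing character as a latent variable made it possible to estimate each rater's judgments more precisely. The model can also be applied to other recognition tasks that involve subjectivity, such as emotion recognition. To use the engagement estimation model in an actual dialogue robot, real-time estimation was also pursued. Specifically, automatic detection models for the input behaviors were implemented using deep learning and integrated with the engagement estimation model. As a result, real-time estimation was confirmed to be possible while maintaining the accuracy of engagement estimation. Finally, the real-time engagement estimation system was implemented on the dialogue system of the autonomous android ERICA, and a laboratory-introduction demonstration was conducted. In this setting, ERICA introduces the laboratory, automatically estimates the engagement of the visiting user, and dynamically changes its behavior accordingly.
Research Progress Status	Not entered because FY2017 is the final fiscal year.
Strategy for Future Research Activity	Not entered because FY2017 is the final fiscal year.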
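
The latent character idea in the outline above can be illustrated with a minimal sketch. It assumes (this is not stated in the record) that each rater is a probability distribution over a small number of characters, and that each character assigns an engagement probability to every combination of the four listener behaviors. The behavior names follow the outline; the function names, the two characters, and all numbers are hypothetical placeholders, not the project's implementation or results.

```python
# Illustrative sketch only: the behavior list comes from the outline;
# every name and number below is a hypothetical placeholder.
BEHAVIORS = ("backchannel", "laughing", "nodding", "eye_gaze")


def pattern_index(observed):
    """Encode the set of observed listener behaviors as an integer in 0..15."""
    return sum(1 << i for i, name in enumerate(BEHAVIORS) if name in observed)


def make_table(base, step):
    """Hypothetical P(engaged | pattern, character), growing with the behavior count."""
    return [min(0.95, base + step * bin(p).count("1"))
            for p in range(2 ** len(BEHAVIORS))]


# Two hypothetical characters: a strict one and a lenient one.
CHARACTER_TABLES = [make_table(0.1, 0.2), make_table(0.4, 0.2)]


def engagement_probability(observed, character_weights):
    """Marginalize the latent character:
    sum over k of P(k | rater) * P(engaged | pattern, k)."""
    idx = pattern_index(observed)
    return sum(w * table[idx]
               for w, table in zip(character_weights, CHARACTER_TABLES))


# Hypothetical raters, each a distribution over the two characters.
rater_a = (0.8, 0.2)  # mostly strict
rater_b = (0.2, 0.8)  # mostly lenient
observed = {"laughing", "nodding"}
print(engagement_probability(observed, rater_a))
print(engagement_probability(observed, rater_b))
```

With the same observed behaviors, the two hypothetical raters receive different engagement probabilities, which is the kind of rater-dependent judgment the outline says the latent variable is meant to capture.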

Report

(3 results)

Research Products

(28 results)

Journal Article (4 results) (of which Peer Reviewed: 4 results, Open Access: 2 results, Acknowledgement Compliant: 1 result)
Presentation (24 results) (of which Int'l Joint Research: 7 results)

[Journal Article] Engagement Recognition from Listener’s Behaviors in Spoken Dialogue Using a Latent Character Model (2018)
- Author(s)
  井上昂治, Divesh Lala, 吉井和佳, 高梨克也, 河原達也
- Journal Title
  Transactions of the Japanese Society for Artificial Intelligence
  Volume: 33 Issue: 1 Pages: DSH-F_1-12
- DOI
  10.1527/tjsai.DSH-F
- NAID
  130006302231
- ISSN
  1346-0714, 1346-8030
- Related Report
  2017 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Speaker Diarization by Combining Acoustic and Eye-Gaze Information in Multi-Party Conversations (2016)
- Author(s)
  井上昂治, 若林佑幸, 吉本廣雅, 河原達也
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition)
  Volume: J99-D Issue: 3 Pages: 348-357
- DOI
  10.14923/transinfj.2015JDP7024
- ISSN
  1880-4535, 1881-0225
- Year and Date
  2016-03-01
- Related Report
  2015 Annual Research Report
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Speaker Diarization and Source Number Estimation Based on Audio-Visual Integration (2016)
- Author(s)
  若林佑幸, 井上昂治, 中山雅人, 西浦敬信, 山下洋一, 吉本廣雅, 河原達也
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition)
  Volume: J99-D Issue: 3 Pages: 326-336
- DOI
  10.14923/transinfj.2015PDP0006
- ISSN
  1880-4535, 1881-0225
- Year and Date
  2016-03-01
- Related Report
  2015 Annual Research Report
- Peer Reviewed
[Journal Article] Multi-modal sensing and analysis of poster conversations with smart posterboard (2016)
- Author(s)
  Tatsuya Kawahara, Takuma Iwatate, Koji Inoue, Soichiro Hayashi, Hiromasa Yoshimoto, Katsuya Takanashi
- Journal Title
  APSIPA Transactions on Signal and Information Processing
  Volume: 5 Issue: 1 Pages: 1-12
- DOI
  10.1017/atsip.2016.2
- Related Report
  2015 Annual Research Report
- Peer Reviewed / Open Access
[Presentation] Latent character model for engagement recognition based on multimodal behaviors (2018)
- Author(s)
  Koji Inoue, Divesh Lala, Katsuya Takanashi, Tatsuya Kawahara
- Organizer
  International Workshop on Spoken Dialogue Systems
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
[Presentation] Spoken dialogue system based on engagement estimation for the autonomous android ERICA (2018)
- Author(s)
  井上昂治, Lala Divesh, 高梨克也, 河原達也
- Organizer
  Acoustical Society of Japan 2018 Spring Meeting
- Related Report
  2017 Annual Research Report
[Presentation] Detection of social signals for recognizing engagement in human-robot interaction (2017)
- Author(s)
  Divesh Lala, Koji Inoue, Pierrick Milhorat, Tatsuya Kawahara
- Organizer
  AAAI Fall Symposium Series, Symposium on Natural Communication for Human-Robot Collaboration
- Related Report
  2017 Annual Research Report
- Int'l Joint Research
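[Presentation] Real-time dialogue engagement estimation using a latent character model (2017)
- Author(s)
  井上昂治, Lala Divesh, Milhorat Pierrick, 高梨克也, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)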
- Related Report
  2017 Annual Research Report
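[Presentation] Attentive listening dialogue using diverse listener responses in the autonomous android ERICA (2017)
- Author(s)
  井上昂治, Lala Divesh, Milhorat Pierrick, 石田真也, 趙天雨, 高梨克也, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)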
- Related Report
  2017 Annual Research Report
[Presentation] Dialogue engagement estimation based on listener behaviors using a latent character model (2017)
- Author(s)
  井上昂治, Lala Divesh, 吉井和佳, 高梨克也, 河原達也
- Organizer
  Acoustical Society of Japan 2017 Autumn Meeting
- Related Report
  2017 Annual Research Report
[Presentation] Real-time distant speech recognition using DAE (2017)
- Author(s)
  井上昂治, 三村正人, 石井カルロス寿憲, 坂井信輔, 河原達也
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kawasaki, Kanagawa)
- Related Report
  2016 Annual Research Report
[Presentation] Dialogue engagement estimation based on diverse listener behaviors (2017)
- Author(s)
  井上昂治, Lala Divesh, 高梨克也, 河原達也
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kawasaki, Kanagawa)
- Related Report
  2016 Annual Research Report
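[Presentation] Generation of diverse listener responses for an attentive listening dialogue system (2016)
- Author(s)
  石田真也, 井上昂治, 中村静, 高梨克也, 河原達也
- Organizer
  78th National Convention of the Information Processing Society of Japan (IPSJ)
- Place of Presentation
  Keio University (Yokohama, Kanagawa)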
- Year and Date
  2016-03-12
- Related Report
  2015 Annual Research Report
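[Presentation] Multimodal analysis and detection of ice-breaking in first-encounter dialogue (2016)
- Author(s)
  稲熊寛文, 井上昂治, 中村静, 高梨克也, 河原達也
- Organizer
  78th National Convention of the Information Processing Society of Japan (IPSJ)
- Place of Presentation
  Keio University (Yokohama, Kanagawa)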
- Year and Date
  2016-03-12
- Related Report
  2015 Annual Research Report
[Presentation] Distant speech recognition for the autonomous android ERICA (2016)
- Author(s)
  井上昂治, 三村正人, 石井カルロス寿憲, 河原達也
- Organizer
  Acoustical Society of Japan 2016 Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa)
- Year and Date
  2016-03-09
- Related Report
  2015 Annual Research Report
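[Presentation] Analysis of the occurrence positions and forms of fillers for smooth turn-taking control by an autonomous android (2016)
- Author(s)
  中西亮輔, 井上昂治, 中村静, 高梨克也, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Nakao Community Center (Nozawaonsen Village, Shimotakai District, Nagano)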
- Year and Date
  2016-03-01
- Related Report
  2015 Annual Research Report
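[Presentation] Generation of backchannels in diverse forms based on linguistic and prosodic information for an attentive listening dialogue system (2016)
- Author(s)
  山口貴史, 井上昂治, 吉野幸一郎, 高梨克也, Nigel G. Ward, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Nakao Community Center (Nozawaonsen Village, Shimotakai District, Nagano)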
- Year and Date
  2016-03-01
- Related Report
  2015 Annual Research Report
[Presentation] Analysis and prediction of morphological patterns of backchannels for attentive listening agents (2016)
- Author(s)
  Takashi Yamaguchi, Koji Inoue, Koichiro Yoshino, Katsuya Takanashi, Nigel G. Ward, Tatsuya Kawahara
- Organizer
  2016 International Workshop on Spoken Dialogue Systems
- Place of Presentation
  Lapland (Finland)
- Year and Date
  2016-01-14
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
[Presentation] Talking with ERICA, an autonomous android (2016)
- Author(s)
  Koji Inoue, Pierrick Milhorat, Divesh Lala, Tianyu Zhao, Tatsuya Kawahara
- Organizer
  SIGDIAL 2016
- Place of Presentation
  Los Angeles (The United States of America)
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
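[Presentation] Interaction in accordance with social roles by the autonomous android ERICA (2016)
- Author(s)
  井上昂治, Milhorat Pierrick, Lala Divesh, 趙天雨, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Waseda University (Shinjuku, Tokyo)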
- Related Report
  2016 Annual Research Report
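[Presentation] Dialogue engagement estimation based on diverse listener behaviors using a hierarchical Bayesian model (2016)
- Author(s)
  井上昂治, Lala Divesh, 高梨克也, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Waseda University (Shinjuku, Tokyo)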
- Related Report
  2016 Annual Research Report
[Presentation] Multimodal interaction with the autonomous android ERICA (2016)
- Author(s)
  Divesh Lala, Pierrick Milhorat, Koji Inoue, Tianyu Zhao, Tatsuya Kawahara
- Organizer
  ICMI 2016
- Place of Presentation
  National Museum of Emerging Science and Innovation, Miraikan (Koto, Tokyo)
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Annotation and analysis of listener's engagement based on multi-modal behaviors (2016)
- Author(s)
  Koji Inoue, Divesh Lala, Katsuya Takanashi, Tatsuya Kawahara
- Organizer
  ICMI 2016 workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
- Place of Presentation
  Time 24 Building (Koto, Tokyo)
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
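[Presentation] Spoken dialogue system for the autonomous android Erica (2015)
- Author(s)
  井上昂治, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Waseda University (Shinjuku, Tokyo)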
- Year and Date
  2015-10-29
- Related Report
  2015 Annual Research Report
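[Presentation] Prediction of backchannel forms for an attentive listening dialogue system that produces diverse backchannels (2015)
- Author(s)
  山口貴史, 井上昂治, 吉野幸一郎, 高梨克也, Nigel G. Ward, 河原達也
- Organizer
  Japanese Society for Artificial Intelligence, SIG on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Waseda University (Shinjuku, Tokyo)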
- Year and Date
  2015-10-29
- Related Report
  2015 Annual Research Report
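[Presentation] Detection of speaker segments and backchannels by probabilistic integration of acoustic and eye-gaze information in poster conversations (2015)
- Author(s)
  井上昂治, 若林佑幸, 吉本廣雅, 高梨克也, 河原達也
- Organizer
  Acoustical Society of Japan 2015 Autumn Meeting
- Place of Presentation
  University of Aizu (Aizuwakamatsu, Fukushima)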
- Year and Date
  2015-09-17
- Related Report
  2015 Annual Research Report
[Presentation] Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations (2015)
- Author(s)
  Koji Inoue, Yukoh Wakabayashi, Hiromasa Yoshimoto, Katsuya Takanashi, Tatsuya Kawahara
- Organizer
  INTERSPEECH 2015
- Place of Presentation
  Dresden (Germany)
- Year and Date
  2015-09-10
- Related Report
  2015 Annual Research Report
- Int'l Joint Research
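[Presentation] Detection of speaker segments and backchannels using eye-gaze information on a smart posterboard (2015)
- Author(s)
  井上昂治, 若林佑幸, 吉本廣雅, 高梨克也, 河原達也
- Organizer
  Information Processing Society of Japan, SIG on Music and Computer (SIGMUS)
- Place of Presentation
  The University of Electro-Communications (Chofu, Tokyo)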
- Year and Date
  2015-05-24
- Related Report
  2015 Annual Research Report
