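Building a Multimodal Dialogue System with Realistic CG to Identify the Factors That Make Chat Dialogue Enjoyable (original title: 楽しい雑談対話の要因解明のためのリアルなCGとのマルチモーダル対話システム構築)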

Publicly Offered Research

Project Area	Studies on intelligent systems for dialogue toward the human-machine symbiotic society
Project/Area Number	20H05562
Research Category	Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)
Allocation Type	Single-year Grants
Review Section	Complex systems
Research Institution	Toyohashi University of Technology
Principal Investigator	Kitaoka Norihide (北岡教英), Toyohashi University of Technology, Graduate School of Engineering, Professor (10333501)
Project Period (FY)	2020-04-01 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥11,700,000 (Direct Cost: ¥9,000,000, Indirect Cost: ¥2,700,000); Fiscal Year 2021: ¥5,980,000 (Direct Cost: ¥4,600,000, Indirect Cost: ¥1,380,000); Fiscal Year 2020: ¥5,720,000 (Direct Cost: ¥4,400,000, Indirect Cost: ¥1,320,000)
Keywords	Photorealistic CG / Spoken dialogue / Multimodal dialogue / CG agent
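Outline of Research at the Start	Toward human-machine symbiosis and harmony, it has become necessary for realistic androids and CG agents to hold more human-like, enjoyable dialogue and chat. We therefore focus on the photorealistic high-school-student CG agent "Saya", build a system that conducts multimodal dialogue using speech, facial expressions, and gaze as if talking with a human, and clarify what makes spoken and multimodal dialogue enjoyable. To that end, we realize the real-time, high-accuracy speech, facial-expression, and gesture recognition that chat dialogue requires, together with response generation based on prosody and gesture control adapted to diverse inputs, outputs, and context. We then build a dialogue system with photorealistic CG and conduct dialogue experiments.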
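Outline of Annual Research Achievements	Looking ahead to a future society of human-machine harmony, an important question is how closely machines can approach humans so that communication with them becomes natural and easy. Taking a human-like appearance is one possibility, so we focused on "Saya", a 3D CG character indistinguishable from a real human, and built a system that conducts spoken and multimodal dialogue with Saya as the agent. Because the appearance is realistic, the content of the spoken responses must be equally realistic, or the two will be mismatched. ChatGPT has attracted wide attention as a means of generating such realistic responses, but generative models like ChatGPT, which only generate the next utterance from the dialogue history, make the content hard to control. To address this, we prepared a dataset in which a target topic is given and the utterance stays close to that topic, and fine-tuned a model on it. This yields a method in which the topic can be controlled by supplying it at response-generation time as well. The generated response must then be spoken only after the partner has finished speaking and the turn has passed to the system. For this purpose, we proposed an end-of-utterance detection method that judges whether a break (silence) in the partner's current utterance should be regarded as the end of the utterance, that is, whether the system may start speaking. We implemented it on the ROS-based real-time spoken dialogue system that we have proposed.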
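As an illustration of where such an end-of-utterance judgment would be invoked, the following minimal Python sketch finds candidate pauses with a plain frame-energy threshold. It is not the method proposed in this project: the function names, the threshold, and the minimum pause length are all hypothetical, and the project's detector decides at each pause whether the utterance has actually ended, a step this sketch leaves out.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame (trailing partial frame dropped)."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_pause_candidates(
    energies: Sequence[float],
    energy_threshold: float,
    min_silence_frames: int,
) -> List[int]:
    """Return frame indices where a silent run following speech first reaches min_silence_frames.

    Each returned index is only a *candidate* end of utterance. Deciding whether
    the speaker has really finished (and the turn may be taken) is a separate
    judgment that this sketch does not implement.
    """
    candidates: List[int] = []
    silent_run = 0
    seen_speech = False
    for idx, energy in enumerate(energies):
        if energy >= energy_threshold:
            # Speech frame: reset the silence counter.
            seen_speech = True
            silent_run = 0
        else:
            silent_run += 1
            # Report the pause once, at the frame where it becomes long enough.
            if seen_speech and silent_run == min_silence_frames:
                candidates.append(idx)
    return candidates


if __name__ == "__main__":
    # Toy signal: speech, a pause, more speech, then a short trailing silence.
    signal = [0.5] * 8 + [0.0] * 6 + [0.5] * 4 + [0.0] * 2
    energies = frame_energies(signal, frame_len=2)
    print(find_pause_candidates(energies, energy_threshold=0.01, min_silence_frames=2))
```

In a real-time system, each candidate index would trigger the turn-taking decision, and a pause judged to be mid-utterance would simply be ignored.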
Research Progress Status	Not entered because FY2021 (Reiwa 3) is the final year.
Strategy for Future Research Activity	Not entered because FY2021 (Reiwa 3) is the final year.

Report

(2 results)

2021 Annual Research Report
2020 Annual Research Report

Research Products
(14 results)

Journal Article (5 results) (of which Peer Reviewed: 3 results, Open Access: 4 results), Presentation (9 results) (of which Int'l Joint Research: 3 results)

[Journal Article] Input Utterance Complementation Method by Anaphora Resolution for Spontaneous Utterances on Spoken Dialog Systems (2022)
- Author(s)
  Nishimura Ryota, Mori Raita, Ohta Kengo, Kitaoka Norihide
- Journal Title
  Transactions of the Japanese Society for Artificial Intelligence
  Volume: 37 Issue: 3 Pages: IDS-F_1-13
- DOI
  10.1527/tjsai.37-3_IDS-F
- ISSN
  1346-0714, 1346-8030
- Year and Date
  2022-05-01
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Multimodal dialog with photorealistic CG agent (2022)
- Author(s)
  北岡教英, 西村良太, 太田健吾
- Journal Title
  THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN
  Volume: 78 Issue: 5 Pages: 257-264
- DOI
  10.20697/jasj.78.5_257
- ISSN
  0369-4232, 2432-2040
- Year and Date
  2022-05-01
- Related Report
  2021 Annual Research Report
- Open Access
[Journal Article] Response type selection for chat-like spoken dialog systems based on LSTM and multi-task learning (2021)
- Author(s)
  Ohta Kengo, Nishimura Ryota, Kitaoka Norihide
- Journal Title
  Speech Communication
  Volume: 133 Pages: 23-30
- DOI
  10.1016/j.specom.2021.07.003
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Dynamic out-of-vocabulary word registration to language model for speech recognition (2021)
- Author(s)
  Kitaoka Norihide, Chen Bohan, Obashi Yuya
- Journal Title
  EURASIP Journal on Audio, Speech, and Music Processing
  Volume: 2021 Issue: 1 Pages: 1-8
- DOI
  10.1186/s13636-020-00193-1
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Multimodal agent "Saya" supporting next-generation mobility (original title: 次世代の移動を支えるマルチモーダルエージェント"Saya") (2021)
- Author(s)
  大須賀晋, 田中五大, 鍋倉彩那, 藤井宏行, 中野涼太, 渡邊凌太, TELYUKA, 太田健吾, 西村良太, 北岡教英
- Journal Title
  自動車技術 (Jidosha Gijutsu)
  Volume: 75 Pages: 9-9
- Related Report
  2020 Annual Research Report
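[Presentation] Real-time turn-taking system that can also handle interrupting (barge-in) utterances (original title: 割り込み発話にも対応可能なリアルタイム話者交替システム) (2023)
- Author(s)
  杉山雅和, 西村良太, 太田健吾, 北岡教英
- Organizer
  Acoustical Society of Japan Spring Meeting (日本音響学会春季研究発表会)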
- Related Report
  2021 Annual Research Report
[Presentation] A response generation method of chat-bot system using input formatting and reference resolution (2022)
- Author(s)
  Takahiro Kinouchi, Norihide Kitaoka
- Organizer
  ICAICTA-2022
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
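[Presentation] EMOtive A.I. "Saya" (2022)
- Author(s)
  大須賀晋, 田中五大, 鍋倉彩那, 中野涼太, 渡邊凌太, 石川友香, 石川晃之, 中村晃一, 藤井裕也, 堀内颯太, 東中竜一郎, 西村良太, 太田健吾, 北岡教英
- Organizer
  Japanese Society for Artificial Intelligence, Special Interest Group on Spoken Language Understanding and Dialogue Processing (人工知能学会言語・音声理解と対話処理研究会)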
- Related Report
  2021 Annual Research Report
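[Presentation] Design of an End-to-End speech recognition model augmented with out-of-task acoustic information (original title: タスク外音響情報を付加したEnd-to-End音声認識モデルの設計) (2022)
- Author(s)
  森大輝, 太田健吾, 西村良太, 小川厚徳, 北岡教英
- Organizer
  Acoustical Society of Japan (日本音響学会)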
- Related Report
  2020 Annual Research Report
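[Presentation] End-to-End speech recognition with hesitation cleanup using disfluency labels (original title: 非流暢ラベルを用いた言い淀み整形End-to-End音声認識) (2022)
- Author(s)
  堀井こはる, 福田芽衣子, 太田健吾, 西村良太, 小川厚徳, 北岡教英
- Organizer
  Acoustical Society of Japan (日本音響学会)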
- Related Report
  2020 Annual Research Report
[Presentation] Advanced language model fusion method for encoder-decoder model in Japanese speech (2021)
- Author(s)
  Daiki Mori, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, Norihide Kitaoka
- Organizer
  APSIPA ASC 2021
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] End-to-end spontaneous speech recognition using hesitation labeling (2021)
- Author(s)
  Koharu Horii, Meiko Fukuda, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, Norihide Kitaoka
- Organizer
  APSIPA ASC 2021
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
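[Presentation] A method for replacing implicit linguistic information in Encoder-Decoder speech recognition models (original title: Encoder-Decoder音声認識モデルにおける暗黙的言語情報の置換法) (2021)
- Author(s)
  森大輝, 太田健吾, 西村良太, 小川厚徳, 北岡教英
- Organizer
  Acoustical Society of Japan (日本音響学会)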
- Related Report
  2020 Annual Research Report
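[Presentation] End-to-End speech recognition of spontaneous speech that accounts for hesitations (original title: 言い淀みを考慮した自由発話のEnd-to-End音声認識) (2021)
- Author(s)
  堀井こはる, 福田芽衣子, 太田健吾, 西村良太, 北岡教英
- Organizer
  Acoustical Society of Japan (日本音響学会)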
- Related Report
  2020 Annual Research Report
