Social Expression for Dialogue Robots with focus on Individuality

Publicly Offered Research

Project Area	Studies on intelligent systems for dialogue toward the human-machine symbiotic society
Project/Area Number	22H04875
Research Category	Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area)
Allocation Type	Single-year Grants
Review Section	Complex systems
Research Institution	Advanced Telecommunications Research Institute International
Principal Investigator	Ishi Carlos Toshinori (石井カルロス寿憲), Advanced Telecommunications Research Institute International, Hiroshi Ishiguro Laboratories, Group Leader (30418529)
Project Period (FY)	2022-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥21,840,000 (Direct Cost: ¥16,800,000, Indirect Cost: ¥5,040,000); Fiscal Year 2023: ¥10,920,000 (Direct Cost: ¥8,400,000, Indirect Cost: ¥2,520,000); Fiscal Year 2022: ¥10,920,000 (Direct Cost: ¥8,400,000, Indirect Cost: ¥2,520,000)
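Keywords	Speech information processing / Nonverbal information processing / Paralinguistic information processing / Human-robot interaction / Sound environment intelligence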
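Outline of Research at the Start	Building on and extending the knowledge accumulated by the proposer's research team, this research constructs a mathematical model of social expression that can represent how people change their expression depending on the dialogue partner and the situation, and how individuality further changes that expression, and implements the model in interactions with robots and agents. The research is expected to let people engage more naturally with robots and agents and to greatly broaden the range of uses for robots and agents.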
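Outline of Annual Research Achievements	This research aimed to realize dialogue robots and agents with humanlike, natural speaking styles and speech-accompanying motions. In particular, we pursued research and development from multiple directions with the goal of clarifying a mathematical model of "social expression" that can represent, with individuality taken into account, how people change their expression depending on the dialogue partner and the situation, and of implementing that model in interactions with robots and agents. For gaze control when a dialogue robot talks with multiple people, we proposed a method that takes dialogue roles and gaze aversion into account, implemented the gaze behavior on the small robot CommU and the android NIKOLA, and collected impression ratings in subject experiments. Motions generated by models of two people with different personalities were shown to give different impressions of extroversion through differences in movement, even with the same voice. For upper-body and hand gesture generation with deep generative models, we trained a deep learning model that generates hand gestures conditioned on prosodic features extracted from the input speech, and proposed a framework in which the impression of the generated motion can be controlled explicitly by dividing hand movement into three levels and adding the level as a conditioning input to the model. Evaluating the impressions of motions generated on a CG avatar and on the small robot CommU showed that motions correlated with impressions of extroversion and of degree of excitement could be generated. We also analyzed the acoustic features of spontaneous "mirthful laughter" and social "polite laughter" appearing in natural conversation. Mirthful laughter showed patterns common across speakers, whereas polite laughter tended to show many speaker-dependent variations, such as containing strong breathiness, containing no breathiness at all, or containing nasal sounds. We also analyzed individual differences in laughter style. In addition, we advanced research on multimodal intention recognition; building on the results so far, we will continue working toward realizing social expression for agents.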
Research Progress Status	Not entered, because fiscal year 2023 (Reiwa 5) is the final year.
Strategy for Future Research Activity	Not entered, because fiscal year 2023 (Reiwa 5) is the final year.

Report

(2 results)

2023 Annual Research Report
2022 Annual Research Report

Research Products
(30 results)

Journal Article (14 results) (of which Int'l Joint Research: 7 results, Peer Reviewed: 14 results, Open Access: 6 results); Presentation (16 results) (of which Int'l Joint Research: 12 results, Invited: 5 results)

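[Journal Article] Analysis of gaze behavior in multi-party dialogue and expression of personality through gaze behavior generation for dialogue robots [複数人対話における視線動作の解析および対話ロボットの視線動作生成による個性の表出] (2024)
- Author(s)
  Shintani Taiken, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Journal Title
  
  Journal of the Robotics Society of Japan (日本ロボット学会誌)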
  
  Volume: 42 Pages: 151-158
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Extrovert or Introvert? GAN-Based Humanoid Upper-Body Gesture Generation for Different Impressions (2023)
- Author(s)
  Wu Bowen, Liu Chaoran, Ishi Carlos Toshinori, Shi Jiaqi, Ishiguro Hiroshi
- Journal Title
  
  International Journal of Social Robotics
  
  Volume: - Issue: 3 Pages: 1-16
- DOI
  10.1007/s12369-023-01051-8
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] QUICKVC: A Lightweight VITS-Based Any-to-Many Voice Conversion Model using ISTFT for Faster Conversion (2023)
- Author(s)
  Guo Houjian, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Journal Title
  
  Proc. ASRU2023
  
  Volume: - Pages: 1-5
- DOI
  10.1109/asru57964.2023.10389621
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Using Joint Training Speaker Encoder With Consistency Loss to Achieve Cross-Lingual Voice Conversion and Expressive Voice Conversion (2023)
- Author(s)
  Guo Houjian, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Journal Title
  
  Proc. ASRU2023
  
  Volume: - Pages: 1-5
- DOI
  10.1109/asru57964.2023.10389651
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Recognizing Real-World Intentions using A Multimodal Deep Learning Approach with Spatial-Temporal Graph Convolutional Networks (2023)
- Author(s)
  Shi Jiaqi, Liu Chaoran, Ishi Carlos Toshinori, Wu Bowen, Ishiguro Hiroshi
- Journal Title
  
  Proc. IROS2023
  
  Volume: - Pages: 3819-3826
- DOI
  10.1109/iros55552.2023.10341981
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Int'l Joint Research
[Journal Article] Voice types and voice quality in Japanese anime (2023)
- Author(s)
  Carlos T. Ishi, Akira Utsugi, Ichiro Ota
- Journal Title
  
  Proc. ICPhS2023
  
  Volume: - Pages: 3632-3636
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Laughter patterns in multi-speaker conversation data: comparison between spontaneous laughter and intentional laughter (2023)
- Author(s)
  Kexin Wang, Carlos Ishi, Ryoko Hayashi
- Journal Title
  
  Proc. ICPhS2023
  
  Volume: - Pages: 1771-1775
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] An attention-based sound selective hearing support system: evaluation by subjects with age-related hearing loss (2023)
- Author(s)
  Ishi Carlos T., Liu Chaoran, Minato Takashi
- Journal Title
  
  Proc. SII2023
  
  Volume: - Pages: 1-6
- DOI
  10.1109/sii55687.2023.10039165
- Related Report
  2022 Annual Research Report
- Peer Reviewed
[Journal Article] An improved CycleGAN-based emotional voice conversion model by augmenting temporal dependency with a transformer (2022)
- Author(s)
  Fu Changzeng, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Journal Title
  
  Speech Communication
  
  Volume: 144 Pages: 110-121
- DOI
  10.1016/j.specom.2022.09.002
- Related Report
  2022 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] An Adversarial Training Based Speech Emotion Classifier with Isolated Gaussian Regularization (2022)
- Author(s)
  Fu Changzeng, Liu Chaoran, Ishi Carlos, Ishiguro Hiroshi
- Journal Title
  
  IEEE Transactions on Affective Computing
  
  Volume: - Issue: 3 Pages: 1-1
- DOI
  10.1109/taffc.2022.3169091
- Related Report
  2022 Annual Research Report
- Peer Reviewed
[Journal Article] C-CycleTransGAN: A Non-parallel Controllable Cross-gender Voice Conversion Model with CycleGAN and Transformer (2022)
- Author(s)
  Fu Changzeng, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Journal Title
  
  Proc. APSIPA 2022
  
  Volume: - Pages: 1-7
- DOI
  10.23919/apsipaasc55919.2022.9979821
- Related Report
  2022 Annual Research Report
- Peer Reviewed
[Journal Article] Controlling the Impression of Robots via GAN-based Gesture Generation (2022)
- Author(s)
  Wu Bowen, Shi Jiaqi, Liu Chaoran, Ishi Carlos T., Ishiguro Hiroshi
- Journal Title
  
  Proc. IROS22
  
  Volume: - Pages: 9288-9295
- DOI
  10.1109/iros47612.2022.9981535
- Related Report
  2022 Annual Research Report
- Peer Reviewed
[Journal Article] Expression of Personality by Gaze Movements of an Android Robot in Multi-Party Dialogues (2022)
- Author(s)
  Shintani Taiken, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Journal Title
  
  Proc. RO-MAN22
  
  Volume: - Pages: 1534-1541
- DOI
  10.1109/ro-man53752.2022.9900812
- Related Report
  2022 Annual Research Report
- Peer Reviewed
[Journal Article] Prosodic and Voice Quality Analyses of Filled Pauses in Japanese Spontaneous Conversation by Chinese learners and Japanese Native Speakers (2022)
- Author(s)
  Li Xinyue, Ishi Carlos Toshinori, Fu Changzeng, Hayashi Ryoko
- Journal Title
  
  Proc. Speech Prosody 2022
  
  Volume: - Pages: 550-554
- DOI
  10.21437/speechprosody.2022-112
- Related Report
  2022 Annual Research Report
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] QUICKVC: A Lightweight VITS-Based Any-to-Many Voice Conversion Model using ISTFT for Faster Conversion (2023)
- Author(s)
  Guo Houjian, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Organizer
  ASRU2023
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Using Joint Training Speaker Encoder With Consistency Loss to Achieve Cross-Lingual Voice Conversion and Expressive Voice Conversion (2023)
- Author(s)
  Guo Houjian, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi
- Organizer
  ASRU2023
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Recognizing Real-World Intentions using A Multimodal Deep Learning Approach with Spatial-Temporal Graph Convolutional Networks (2023)
- Author(s)
  Shi Jiaqi, Liu Chaoran, Ishi Carlos Toshinori, Wu Bowen, Ishiguro Hiroshi
- Organizer
  IROS2023
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Voice types and voice quality in Japanese anime (2023)
- Author(s)
  Carlos T. Ishi, Akira Utsugi, Ichiro Ota
- Organizer
  ICPhS2023
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Laughter patterns in multi-speaker conversation data: comparison between spontaneous laughter and intentional laughter (2023)
- Author(s)
  Kexin Wang, Carlos Ishi, Ryoko Hayashi
- Organizer
  ICPhS2023
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Multimodal speech processing for dialogue robots and applications of sound environment intelligence (2023)
- Author(s)
  Carlos T. Ishi
- Organizer
  RO-MAN 2023 - Workshop on Speech-based communication for robots and systems
- Related Report
  2023 Annual Research Report
- Int'l Joint Research / Invited
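[Presentation] Multimodal speech information processing and its application to dialogue robots [マルチモーダル音声情報処理と対話ロボットへの応用] (2023)
- Author(s)
  Ishi Carlos Toshinori
- Organizer
  Toyama Prefectural University, Special Lecture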
- Related Report
  2023 Annual Research Report
- Invited
[Presentation] An attention-based sound selective hearing support system: evaluation by subjects with age-related hearing loss (2023)
- Author(s)
  C.T. Ishi, C. Liu, T. Minato
- Organizer
  2023 IEEE/SICE International Symposium on System Integration (SII)
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
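[Presentation] Multimodal speech information processing and its application to dialogue robots [マルチモーダル音声情報処理と対話ロボットへの応用] (2023)
- Author(s)
  Ishi Carlos Toshinori
- Organizer
  Toyama Prefectural University, Special Lecture I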
- Related Report
  2022 Annual Research Report
- Invited
[Presentation] C-CycleTransGAN: A Non-Parallel Controllable Cross-Gender Voice Conversion Model With CycleGAN and Transformer (2022)
- Author(s)
  C. Fu, C. Liu, C. T. Ishi, H. Ishiguro
- Organizer
  2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
[Presentation] Controlling the Impression of Robots via GAN-based Gesture Generation (2022)
- Author(s)
  B. Wu, J. Shi, C. Liu, C.T. Ishi, H. Ishiguro
- Organizer
  2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
[Presentation] Expression of Personality by Gaze Movements of an Android Robot in Multi-Party Dialogues (2022)
- Author(s)
  T. Shintani, C.T. Ishi, H. Ishiguro
- Organizer
  2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
[Presentation] Prosodic and Voice Quality Analyses of Filled Pauses in Japanese Spontaneous Conversation by Chinese learners and Japanese Native Speakers (2022)
- Author(s)
  X. Li, C.T. Ishi, C. Fu, R. Hayashi
- Organizer
  Speech Prosody 2022
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
[Presentation] Analysis and generation of speech-related motions, and evaluation in humanoid robots (2022)
- Author(s)
  C.T. Ishi
- Organizer
  ICMI GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied Agents) Workshop 2022
- Related Report
  2022 Annual Research Report
- Int'l Joint Research / Invited
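[Presentation] The science of voice quality: acoustic features, EGG characteristics, and paralinguistic functions [声質の科学：音響特徴、EGG特性およびパラ言語的機能] (2022)
- Author(s)
  Ishi Carlos Toshinori
- Organizer
  Speech Research Meeting of the Acoustical Society of Japan (ASJ-SP)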
- Related Report
  2022 Annual Research Report
- Invited
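[Presentation] Acoustic features of "mirthful laughter" and "polite laughter" in free conversation: a preliminary analysis [自由会話における「楽しい笑い」と「愛想笑い」の音声的特徴ー予備的分析] (2022)
- Author(s)
  Wang Kexin, Ishi Carlos Toshinori, Hayashi Ryoko
- Organizer
  25th Young Researchers' Exchange Meeting of the Kansai Chapter, Acoustical Society of Japan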
- Related Report
  2022 Annual Research Report
