2022 Fiscal Year Final Research Report
Adversarial Data Augmentation for Multimodal Language Understanding
Project/Area Number | 20H04269 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | General |
Review Section | Basic Section 61050: Intelligent robotics-related |
Research Institution | Keio University |
Principal Investigator | Sugiura Komei, Keio University, Faculty of Science and Technology (Yagami Campus), Professor (60470473) |
Project Period (FY) | 2020-04-01 – 2023-03-31 |
Keywords | Multimodal language processing / Cross-modal language generation / Data augmentation / Domestic service robots / Sim2Real |
Outline of Final Research Achievements | The objectives of this study were (a) robust multimodal language understanding through adversarial data augmentation, (b) multimodal language generation, and (c) evaluation on assistance-dog tasks. We first focused on the Vision-and-Language Navigation task and developed the Momentum-based Adversarial Training (MAT) algorithm. We applied MAT to the standard benchmark ALFRED and obtained successful results. We also worked on the task of generating descriptions of future situations; the main novelty of our proposed method lies in the use of Relational Self-Attention as the attention mechanism. Experimental results show that our method outperformed existing methods on standard metrics. Finally, we integrated the multimodal language understanding and generation methods into a simulator, enabling on-the-fly instruction generation. As a result, we established a robot evaluation framework that requires no manual intervention in task generation, execution, or evaluation. |
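The report does not give the details of the Momentum-based Adversarial Training (MAT) algorithm, so the following is only a minimal, generic sketch of momentum-accumulated adversarial perturbation in the MI-FGSM style (here applied to the inputs of a logistic-regression loss for self-containment). The function name, hyperparameters, and loss are illustrative assumptions, not the authors' actual method.

```python
import numpy as np

def adversarial_perturb(x, w, y, eps=0.1, alpha=0.02, mu=0.9, steps=5):
    """Momentum-accumulated adversarial perturbation (MI-FGSM-style sketch).

    x: inputs of shape (n, d); w: weights of shape (d,); y: binary labels (n,).
    Returns a perturbation delta with each entry clipped to [-eps, eps].
    """
    delta = np.zeros_like(x)
    g = np.zeros_like(x)  # momentum accumulator over normalized gradients
    for _ in range(steps):
        z = (x + delta) @ w
        p = 1.0 / (1.0 + np.exp(-z))      # sigmoid predictions
        grad = np.outer(p - y, w)         # d(BCE loss)/d(input), row-wise
        g = mu * g + grad / (np.abs(grad).mean() + 1e-12)
        # ascend the loss along the momentum sign, projected into the eps-ball
        delta = np.clip(delta + alpha * np.sign(g), -eps, eps)
    return delta
```

In an adversarial data augmentation setting, such perturbations would be added to training inputs so the model also learns from worst-case neighbors of each example; the momentum term stabilizes the ascent direction across steps.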
Free Research Field | Machine intelligence, Intelligent robotics, Multimodal language processing |
Academic Significance and Societal Importance of the Research Achievements | The objective of this research is to build language-understanding technology for domestic service robots that assist with daily tasks, in order to free care recipients and their families from time constraints. Although domestic service robot hardware has recently been standardized, the accuracy of understanding ambiguous instructions remains insufficient. In this research, we achieved world-leading accuracy on standard datasets for multimodal language understanding, and built the world's first domestic service robot evaluation framework that requires no manual intervention in any of task generation, execution, and evaluation. |