2023 Fiscal Year Final Research Report

Development of Motion Generation Technology to Realize Robots that Perform Various Tasks according to Natural Language Instructions

Project/Area Number	21H04910
Research Category	Grant-in-Aid for Scientific Research (A)
Allocation Type	Single-year Grants
Section	General
Review Section	Medium-sized Section 61: Human informatics and related fields
Research Institution	OMRON SINIC X Corporation
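Principal Investigator	Hashimoto Atsushi, OMRON SINIC X Corporation, Research Administrative Division, Senior Researcher (80641753)
Co-Investigator (Kenkyū-buntansha)	Inoue Nakamasa, Tokyo Institute of Technology, School of Computing, Associate Professor (10733397); Ushiku Yoshitaka, OMRON SINIC X Corporation, Research Administrative Division, Principal Investigator (10784142); Hamaya Masashi, OMRON SINIC X Corporation, Research Administrative Division, Senior Researcher (10869176); Matsubara Takamitsu, Nara Institute of Science and Technology, Graduate School of Science and Technology, Professor (20508056); Mori Shinsuke, Kyoto University, Academic Center for Computing and Media Studies, Professor (90456773); Beltran-Hernandez Cristian Camilo, OMRON SINIC X Corporation, Research Administrative Division, Research Engineer (30984017); von Drigalski Felix, OMRON SINIC X Corporation, Research Administrative Division, Senior Researcher (90869215)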
Project Period (FY)	2021-04-05 – 2024-03-31
Keywords	Natural language processing / Cross-modal processing / Robotics
Outline of Final Research Achievements	In this project, we developed a general-purpose computational model that lets robots perform tasks based on verbal instructions. The design follows Norman's seven stages of action model. Using a salad as the test case, we nearly completed the full action flow for a robot controlled by verbal commands. Employing the Planning Domain Definition Language (PDDL) improved the system's explainability and reliability and made control possible without training data. The result is a more practical system than conventional black-box systems.
Free Research Field	Computer vision
Academic Significance and Societal Importance of the Research Achievements	In this project, we developed technology for operating robots based on verbal instructions. Several research institutions carried out similar work during the same period, but that work overlooked the stage of "intention estimation". As a result, those systems depended on the robot's embodiment, and a model specialized to each individual robot had to be developed. It also became difficult to ensure the safety of the robot's actions and to judge whether a verbal instruction had been interpreted correctly. In this project, we established a method that outputs the initial state and goal state of objects from a verbal instruction and an observation of the scene. This makes it possible to remain independent of any particular robot, to keep the results interpretable, and to ensure safety.
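
To illustrate how an estimated initial state and goal state of objects could be handed to a PDDL planner, the following is a minimal sketch and not the project's implementation. It only renders the two states as a PDDL problem definition. The names `Fact` and `build_problem`, the `kitchen` domain, and the predicates `whole`, `on`, `sliced` and `in` are all hypothetical and chosen for the example; the report does not describe the actual domain or predicates.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A ground predicate such as (on tomato board). Hypothetical helper type."""
    predicate: str
    args: tuple[str, ...] = ()

    def to_pddl(self) -> str:
        # Render as an s-expression: (predicate arg1 arg2 ...)
        return "(" + " ".join((self.predicate, *self.args)) + ")"


def build_problem(name: str, domain: str, objects: list[str],
                  init: list[Fact], goal: list[Fact]) -> str:
    """Render initial and goal object states as a PDDL problem string."""
    lines = [
        f"(define (problem {name})",
        f"  (:domain {domain})",
        "  (:objects " + " ".join(objects) + ")",
        "  (:init " + " ".join(f.to_pddl() for f in init) + ")",
        "  (:goal (and " + " ".join(f.to_pddl() for f in goal) + "))",
        ")",
    ]
    return "\n".join(lines)


# Assumed example: states that might be estimated for an instruction such as
# "slice the tomato and put it in the bowl". The predicates are illustrative.
init = [Fact("whole", ("tomato",)), Fact("on", ("tomato", "board"))]
goal = [Fact("sliced", ("tomato",)), Fact("in", ("tomato", "bowl"))]

problem = build_problem("make-salad", "kitchen",
                        ["tomato", "board", "bowl"], init, goal)
print(problem)
```

Because the goal is written as explicit object states and not as robot motions, the same problem text could in principle be checked by a person before execution and solved against different robot-specific domain files, which is the kind of robot independence and interpretability described above.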