Development of Motion Generation Technology to Realize Robots that Perform Various Tasks according to Natural Language Instructions

Research Project

Project/Area Number	21H04910
Research Category	Grant-in-Aid for Scientific Research (A)
Allocation Type	Single-year Grants
Section	General
Review Section	Medium-sized Section 61: Human informatics and related fields
Research Institution	OMRON SINIC X Corporation
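Principal Investigator	Hashimoto Atsushi, OMRON SINIC X Corporation, Research Administrative Division, Senior Researcher (80641753)
Co-Investigator(Kenkyū-buntansha)	Inoue Nakamasa, Tokyo Institute of Technology, School of Computing, Associate Professor (10733397); Ushiku Yoshitaka, OMRON SINIC X Corporation, Research Administrative Division, Principal Investigator (10784142); Hamaya Masashi, OMRON SINIC X Corporation, Research Administrative Division, Senior Researcher (10869176); Matsubara Takamitsu, Nara Institute of Science and Technology, Graduate School of Science and Technology, Professor (20508056); Mori Shinsuke, Kyoto University, Academic Center for Computing and Media Studies, Professor (90456773); Beltran-Hernandez Cristian Camilo, OMRON SINIC X Corporation, Research Administrative Division, Research Engineer (30984017); Von Drigalski Felix, OMRON SINIC X Corporation, Research Administrative Division, Senior Researcher (90869215)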
Project Period (FY)	2021-04-05 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥41,990,000 (Direct Cost: ¥32,300,000, Indirect Cost: ¥9,690,000); Fiscal Year 2023: ¥15,470,000 (Direct Cost: ¥11,900,000, Indirect Cost: ¥3,570,000); Fiscal Year 2022: ¥14,820,000 (Direct Cost: ¥11,400,000, Indirect Cost: ¥3,420,000); Fiscal Year 2021: ¥11,700,000 (Direct Cost: ¥9,000,000, Indirect Cost: ¥2,700,000)
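Keywords	Natural language processing / Cross-modal processing / Robotics
Outline of Research at the Start	With the working-age population shrinking, putting robots to industrial use is an urgent issue. Language instructions are attracting attention as a low-cost way to have robots take over human tasks. However, conventional computational models of the form "language instruction → robot control" end up specialized to particular tasks. This research proposes and verifies a general-purpose computational model that covers diverse tasks. We target cooking, a domain rich in language and video resources, and aim for a proof of concept by realizing, by the final fiscal year, a robot that prepares relatively simple dishes such as salad according to language instructions.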
Outline of Final Research Achievements	In this project, we developed a general-purpose computational model that lets robots perform tasks from verbal instructions. The design follows Norman's seven stages of action model. Using salad preparation as the test task, we nearly completed the action flow for a robot controlled by verbal commands. By employing the Planning Domain Definition Language (PDDL), we improved the system's explainability and reliability and achieved control without the need for training data. This allowed us to create a more practical system than conventional black-box systems.
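Academic Significance and Societal Importance of the Research Achievements	In this project, we developed technology for operating robots based on language instructions. Several research institutions conducted similar research during the same period, but those studies overlooked the "intention estimation" stage. As a result, their systems depended on the robot's embodiment, and a specialized model had to be developed for each individual robot. It also became difficult to ensure the safety of motions and to judge whether a language instruction had been interpreted correctly. In this project, we established a method that outputs the initial state and the goal state of objects from a language instruction and an observation of the scene. This makes the approach independent of any specific robot, makes its results interpretable, and makes it possible to guarantee safety.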

Report

(5 results)

2023 Annual Research Report Final Research Report ( PDF )
2022 Annual Research Report
2021 Comments on the Screening Results Annual Research Report

Research Products
(25 results)

Journal Article (5 results) (of which Peer Reviewed: 3 results, Open Access: 2 results) / Presentation (16 results) (of which Int'l Joint Research: 10 results, Invited: 3 results) / Remarks (3 results) / Patent(Industrial Property Rights) (1 result)

[Journal Article] Recipe Generation from Unsegmented Cooking Videos (2024)
- Author(s)
  Nishimura Taichi, Hashimoto Atsushi, Ushiku Yoshitaka, Kameko Hirotaka, Mori Shinsuke
- Journal Title
  ACM Transactions on Multimedia Computing, Communications, and Applications
  Volume: -
- DOI
  10.1145/3649137
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] State-aware video procedural captioning (2023)
- Author(s)
  Nishimura Taichi, Hashimoto Atsushi, Ushiku Yoshitaka, Kameko Hirotaka, Mori Shinsuke
- Journal Title
  Multimedia Tools and Applications
  Volume: 82 Issue: 24 Pages: 37273-37301
- DOI
  10.1007/s11042-023-14774-7
- Related Report
  2023 Annual Research Report
- Peer Reviewed
[Journal Article] Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows (2023)
- Author(s)
  Shirai Keisuke, Hashimoto Atsushi, Nishimura Taichi, Kameko Hirotaka, Kurita Shuhei, Mori Shinsuke
- Journal Title
  Journal of Natural Language Processing
  Volume: 30 Issue: 3 Pages: 1042-1060
- DOI
  10.5715/jnlp.30.1042
- ISSN
  1340-7619, 2185-8314
- Related Report
  2023 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Learning by Breaking: Food Fracture Anticipation for Robotic Food Manipulation (2022)
- Author(s)
  Ishikawa Reina, Hamaya Masashi, Von Drigalski Felix, Tanaka Kazutoshi, Hashimoto Atsushi
- Journal Title
  IEEE Access
  Volume: 10 Pages: 99321-99329
- DOI
  10.1109/access.2022.3207491
- Related Report
  2022 Annual Research Report
[Journal Article] BioVL2: An Egocentric Biochemical Video-and-Language Dataset (2022)
- Author(s)
  Nishimura Taichi, Sakoda Kojiro, Ushiku Atsushi, Hashimoto Atsushi, Okuda Natsuko, Ono Fumihito, Kameko Hirotaka, Mori Shinsuke
- Journal Title
  Journal of Natural Language Processing
  Volume: 29 Issue: 4 Pages: 1106-1137
- DOI
  10.5715/jnlp.29.1106
- ISSN
  1340-7619, 2185-8314
- Related Report
  2022 Annual Research Report
[Presentation] Vision-Language Interpreter for Robot Task Planning (2024)
- Author(s)
  Keisuke Shirai, Cristian C. Beltran-Hernandez, Masashi Hamaya, Atsushi Hashimoto, Shohei Tanaka, Kento Kawaharazuka, Kazutoshi Tanaka, Yoshitaka Ushiku, and Shinsuke Mori
- Organizer
  International Conference on Robotics and Automation
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] PolarDB: Formula-driven Dataset for Pre-training Trajectory Encoders (2024)
- Author(s)
  Sota Miyamoto, Takuma Yagi, Yuto Makimoto, Mahiro Ukai, Yoshitaka Ushiku, Atsushi Hashimoto, Nakamasa Inoue
- Organizer
  International Conference on Acoustics, Speech, and Signal Processing
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
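[Presentation] Social Implementation of Generative AI in the Harmonization of Humans and Machines (2024)
- Author(s)
  Hashimoto Atsushi
- Organizer
  The Iron and Steel Institute of Japan, Division of Instrumentation, Control and System Engineering, symposium "Expectations and Challenges in Industrial Applications of Generative AI"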
- Related Report
  2023 Annual Research Report
- Invited
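[Presentation] Construction of a Fixed-Viewpoint Video Dataset with Language Resources for Understanding Cooking Tasks (2024)
- Author(s)
  Hashimoto Atsushi, Maeda Koki, Hirasawa Tosho, Harashima Jun, Rybicki Leszek, Fukasawa Yusuke, Ushiku Yoshitaka
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (JSAI)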
- Related Report
  2023 Annual Research Report
[Presentation] SliceIt!--A Dual Simulator Framework for Learning Robot Food Slicing (2024)
- Author(s)
  Cristian C. Beltran-Hernandez, Nicolas Erbetti, and Masashi Hamaya
- Organizer
  International Conference on Robotics and Automation
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Integrated Task and Motion Planning for Real-World Cooking Tasks (2024)
- Author(s)
  Jeremy Siburian, Cristian Camilo Beltran-Hernandez, Masashi Hamaya
- Organizer
  International Conference on Robotics and Automation Workshop
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Learning Food Picking without Food: Fracture Anticipation by Breaking Reusable Fragile Objects (2023)
- Author(s)
  Rinto Yagawa, Reina Ishikawa, Masashi Hamaya, Kazutoshi Tanaka, Atsushi Hashimoto, Hideo Saito
- Organizer
  International Conference on Robotics and Automation
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
[Presentation] Deep Segmented DMP Networks for Learning Discontinuous Motions (2023)
- Author(s)
  Edgar Anarossi, Hirotaka Tahara, Naoto Komeno, Takamitsu Matsubara
- Organizer
  IEEE International Conference on Automation Science and Engineering
- Related Report
  2023 Annual Research Report
- Int'l Joint Research
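[Presentation] Fine-grained Action Recognition in Egocentric Cooking Videos Using Hand Trajectory Features (2023)
- Author(s)
  Miyamoto Sota, Yagi Takuma, Ushiku Yoshitaka, Hashimoto Atsushi, Inoue Nakamasa
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)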
- Related Report
  2022 Annual Research Report
[Presentation] Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows (2022)
- Author(s)
  Keisuke Shirai, Atsushi Hashimoto, Taichi Nishimura, Hirotaka Kameko, Shuhei Kurita, Yoshitaka Ushiku, Shinsuke Mori
- Organizer
  COLING2022
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
[Presentation] Cross-modal Representation Learning for Understanding Manufacturing Procedure (2022)
- Author(s)
  Atsushi Hashimoto, Taichi Nishimura, Yoshitaka Ushiku, Hirotaka Kameko, Shinsuke Mori
- Organizer
  HCII2022
- Related Report
  2022 Annual Research Report
- Int'l Joint Research
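[Presentation] Construction of a Dataset Considering State Changes of Action Targets in the Recipe Domain and a Proposed Retrieval Model (2022)
- Author(s)
  Shirai Keisuke, Hashimoto Atsushi, Ushiku Yoshitaka, Kurita Shuhei, Kameko Hirotaka, Mori Shinsuke
- Organizer
  The 28th Annual Meeting of the Association for Natural Language Processing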
- Related Report
  2021 Annual Research Report
[Presentation] Egocentric Biochemical Video-and-Language Dataset (2021)
- Author(s)
  Nishimura Taichi, Sakoda Kojiro, Hashimoto Atsushi, Ushiku Yoshitaka, Tanaka Natsuko, Ono Fumihito, Kameko Hirotaka, Mori Shinsuke
- Organizer
  The 4th Workshop on Closing the Loop Between Vision and Language in conjunction with ICCV2021
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] State-aware Video Procedural Captioning (2021)
- Author(s)
  Nishimura Taichi, Hashimoto Atsushi, Ushiku Yoshitaka, Kameko Hirotaka, Mori Shinsuke
- Organizer
  The 29th ACM International Conference on Multimedia
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
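[Presentation] Cross-modal Machine Learning Efforts toward Realizing Robots that Perform Various Tasks according to Natural Language (2021)
- Author(s)
  Hashimoto Atsushi
- Organizer
  Public lecture hosted by the Robotics Society of Japan Research Committee on Data Engineering Robotics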
- Related Report
  2021 Annual Research Report
- Invited
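[Presentation] The Future of Vision, Language, and Robot Control Technologies Integrated by Cross-modal Processing (2021)
- Author(s)
  Hashimoto Atsushi
- Organizer
  Annual Meeting of the Japan Society of Mechanical Engineers (JSME)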
- Related Report
  2021 Annual Research Report
- Invited
[Remarks] Vision-Language Interpreter
- URL
  https://kskshr.github.io/vilain/
- Related Report
  2023 Annual Research Report
[Remarks] SliceIt!
- URL
  https://omron-sinicx.github.io/sliceit/
- Related Report
  2023 Annual Research Report
[Remarks] Integrated TaMP for Real-World Cooking Tasks
- URL
  https://www.youtube.com/watch?v=PS0CYS2NgZY
- Related Report
  2023 Annual Research Report
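[Patent(Industrial Property Rights)] Control Device, Control Method, and Control Program (2022)
- Inventor(s)
  Hamaya Masashi, Ishikawa Reina, Hashimoto Atsushi, Tanaka Kazutoshi
- Industrial Property Rights Holder
  Hamaya Masashi, Ishikawa Reina, Hashimoto Atsushi, Tanaka Kazutoshi
- Industrial Property Rights Type
  Patent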
- Filing Date
  2022
- Related Report
  2021 Annual Research Report
