
Creation of a technological platform for robot reinforcement learning with safety and reliability

Research Project

Project/Area Number 21H03522
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Review Section Basic Section 61050: Intelligent robotics-related
Research Institution Nara Institute of Science and Technology

Principal Investigator

Matsubara Takamitsu  Nara Institute of Science and Technology, Graduate School of Science and Technology, Professor (20508056)

Project Period (FY) 2021-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥17,420,000 (Direct Cost: ¥13,400,000, Indirect Cost: ¥4,020,000)
Fiscal Year 2023: ¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2022: ¥6,110,000 (Direct Cost: ¥4,700,000, Indirect Cost: ¥1,410,000)
Fiscal Year 2021: ¥7,540,000 (Direct Cost: ¥5,800,000, Indirect Cost: ¥1,740,000)
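Keywords Reinforcement learning / Trial and error / Safety / Reliability / Entropy regularization / Monotonic policy improvement / Robot learning / Safety of trial and error / Reliability of learning / Domain-randomized reinforcement learning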
Outline of Research at the Start

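The aim of this research is to establish a technological foundation for robot reinforcement learning that can learn tasks involving physical contact with the environment and tools. Current reinforcement learning frameworks, which are oriented toward the cyber world, lack two properties: "safety", which prevents failure and damage from unexpected physical contact and collisions during trial and error, and "reliability", which ensures that learning dependably improves the policy. This research aims to establish a theory and technological foundation, oriented toward real-world robots, that provides both safety and reliability.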

Outline of Final Research Achievements

In this study, we introduced a reinforcement learning framework that provides the necessary safety and reliability for robots to learn physical tasks involving contact with the environment and tools. Specifically, we developed theories and algorithms to enhance safety by reducing collision risks during trial and error, and to ensure reliability by alleviating policy oscillations due to insufficient experience samples. Additionally, we applied this framework to various tasks involving physical contact using actual robots and validated its effectiveness.

Academic Significance and Societal Importance of the Research Achievements

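In this research, we developed a reinforcement learning technology platform for making effective use of robots in a society facing population decline and super-aging, where labor shortages are becoming increasingly serious. As a result, robots can now learn tasks involving physical contact with the environment and tools more safely and efficiently. Applications to a variety of real-world industries and services, such as parts assembly and cooking, are expected in the future. This technology is expected to promote the spread and practical use of robots and to carry significant societal value.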

Report

(4 results)
  • 2023 Annual Research Report   Final Research Report ( PDF )
  • 2022 Annual Research Report
  • 2021 Annual Research Report
  • Research Products

    (5 results)


Journal Article (3 results) (of which Int'l Joint Research: 1 result, Peer Reviewed: 3 results, Open Access: 1 result) Presentation (2 results) (of which Int'l Joint Research: 2 results)

  • [Journal Article] Cyclic policy distillation: Sample-efficient sim-to-real reinforcement learning with domain randomization (2023)

    • Author(s)
      Kadokawa Yuki, Zhu Lingwei, Tsurumine Yoshihisa, Matsubara Takamitsu
    • Journal Title

      Robotics and Autonomous Systems

      Volume: 165 Pages: 104425-104437

    • DOI

      10.1016/j.robot.2023.104425

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Cautious policy programming: exploiting KL regularization for monotonic policy improvement in reinforcement learning (2023)

    • Author(s)
      Lingwei Zhu and Takamitsu Matsubara
    • Journal Title

      Machine Learning

      Volume: 112 Issue: 11 Pages: 4527-4562

    • DOI

      10.1007/s10994-023-06368-z

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Goal-aware generative adversarial imitation learning from imperfect demonstration for robotic cloth manipulation (2022)

    • Author(s)
      Tsurumine Yoshihisa, Matsubara Takamitsu
    • Journal Title

      Robotics and Autonomous Systems

      Volume: 158 Pages: 104264-104264

    • DOI

      10.1016/j.robot.2022.104264

    • Related Report
      2022 Annual Research Report
    • Peer Reviewed
  • [Presentation] Cautious Actor-Critic (2021)

    • Author(s)
      Lingwei Zhu, Toshinori Kitamura, Takamitsu Matsubara
    • Organizer
      The 13th Asian Conference on Machine Learning (ACML)
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning (2021)

    • Author(s)
      Toshinori Kitamura, Lingwei Zhu, Takamitsu Matsubara
    • Organizer
      The 13th Asian Conference on Machine Learning (ACML)
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research


Published: 2021-04-28   Modified: 2025-01-30  
