Project/Area Number |
22KJ2291
|
Project/Area Number (Other) |
22J14202 (2022)
|
Research Category |
Grant-in-Aid for JSPS Fellows
|
Allocation Type | Multi-year Fund (2023) / Single-year Grants (2022) |
Section | Domestic |
Review Section |
Basic Section 61050:Intelligent robotics-related
|
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
Kuo Cheng-Yu Nara Institute of Science and Technology, Graduate School of Science and Technology, Research Fellow (DC2)
|
Project Period (FY) |
2023-03-08 – 2024-03-31
|
Project Status |
Completed (Fiscal Year 2023)
|
Budget Amount |
¥1,700,000 (Direct Cost: ¥1,700,000)
Fiscal Year 2023: ¥800,000 (Direct Cost: ¥800,000)
Fiscal Year 2022: ¥900,000 (Direct Cost: ¥900,000)
|
Keywords | Model-based RL / Compliant robots / Physics Energy / Model-based reinforcement learning / Elastic robots / Biped robots / Probabilistic methods |
Outline of Research at the Start |
This research focuses on Model-based Reinforcement Learning (MBRL) for compliant biped locomotion. MBRL is highly sample-efficient and capable of on-site learning, which gives it a significant advantage over model-free approaches and makes it better suited to the high complexity of modern biped robots.
|
Outline of Annual Research Achievements |
Previously, we used Model-based Reinforcement Learning (MBRL) to model the energy stored in the robot and to characterize its dynamics as energy flow, based on the law of conservation of energy. This year, we extended that work and applied the MBRL approach to teach the robot to walk on even and uneven terrain at different speeds. We introduced actuators as energy sources into the energy-flow system, which reduces the dimensionality of the state and allows a higher control frequency. Furthermore, we used the robot's energy-formulated state and the law of conservation of energy to design the walking trajectory in energy terms, so that both the robot dynamics and the task are expressed in the same general representation. To verify the effectiveness of our approach, we conducted experiments with a simulated spring-loaded biped robot in a physics simulator. The results show that our approach generalizes across skill conditions, including different terrains and walking speeds. The walking skills are learned on-site, using a compact 9-dimensional energy-formulated state and only a few minutes of samples. Besides developing learning algorithms, we are also building a hardware robot that meets the requirements of our application. We are currently refining the initial design and plan to build the first prototype soon, which will let us test our approach in real-world scenarios.
|
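As a rough illustration of what an energy-formulated state can look like, the sketch below computes kinetic, gravitational, and elastic energy for a single spring-loaded point mass and applies conservation of energy with actuator work as the only energy source. This is not the project's 9-dimensional state or its learned model; the class `SpringLegState`, the functions `energy_state` and `predicted_total_energy`, and all parameter values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class SpringLegState:
    """Hypothetical one-leg, point-mass model (not the project's robot)."""
    mass: float         # kg
    height: float       # m, height of the mass above the ground
    velocity: float     # m/s, speed of the mass
    stiffness: float    # N/m, stiffness of the leg spring
    rest_length: float  # m, spring length with no load
    leg_length: float   # m, current spring length


def energy_state(s: SpringLegState) -> tuple[float, float, float]:
    """Return (kinetic, gravitational, elastic) energy in joules."""
    kinetic = 0.5 * s.mass * s.velocity ** 2
    gravitational = s.mass * G * s.height
    elastic = 0.5 * s.stiffness * (s.rest_length - s.leg_length) ** 2
    return kinetic, gravitational, elastic


def predicted_total_energy(energies: tuple[float, float, float],
                           actuator_work: float) -> float:
    """Conservation of energy, assuming no losses: the total energy after a
    step equals the current total plus the work injected by the actuators."""
    return sum(energies) + actuator_work
```

In this toy setting the actuator enters only as a scalar amount of injected work, which mirrors the idea of treating actuators as energy sources rather than adding their internal variables to the state.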