2021 Fiscal Year Final Research Report

Data-driven quasi-optimal control using machine learning techniques

Project/Area Number	19K20375
Research Category	Grant-in-Aid for Early-Career Scientists
Allocation Type	Multi-year Fund
Review Section	Basic Section 61050: Intelligent robotics-related
Research Institution	Nagoya University
Principal Investigator	Ariizumi Ryo Nagoya University, Graduate School of Engineering, Assistant Professor (30775143)
Project Period (FY)	2019-04-01 – 2022-03-31
Keywords	Reinforcement learning / Robotics / Control engineering
Outline of Final Research Achievements	In this research, we aimed to propose reinforcement learning methods that can obtain sub-optimal inputs (actions) from a relatively small number of samples. In particular, we focused on the PI2 (policy improvement with path integrals) algorithm, which is known to be efficient for robots with many degrees of freedom. One of our proposed algorithms achieves a standing-up motion for a legged robot that starts in an overturned position. This task is very difficult for most existing methods, but our method succeeded using a few thousand samples. We also conducted a basic study on employing control-theoretic methods to speed up reinforcement learning.
Free Research Field	Robotics
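Academic Significance and Societal Importance of the Research Achievements	The effectiveness of reinforcement learning has been demonstrated in a variety of fields, but reinforcement learning for robots with many degrees of freedom has not yet reached practical efficiency: partly because the states and inputs are continuous-valued, some tasks require as many as several hundred thousand trials. To improve data efficiency, this research proposed better ways of using the data. However, improving how the data are used can raise efficiency only so far. We therefore considered incorporating physical properties that evidently hold into the learning process, and carried out a basic study toward realizing this. These results can serve as a foundation for further improving the efficiency of reinforcement learning and for raising the data efficiency of reinforcement learning for robots with many degrees of freedom to a practical level.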