
2020 Fiscal Year Final Research Report

Reinforcement learning method for environment with actuators that can be modeled with first-order lag elements or dead time elements

Research Project

Project/Area Number 18K11424
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type: Multi-year Fund
Section: General
Review Section: Basic Section 61030: Intelligent informatics-related
Research Institution: University of Tsukuba

Principal Investigator

Shibuya Takeshi, University of Tsukuba, Faculty of Engineering, Information and Systems, Assistant Professor (90582776)

Project Period (FY) 2018-04-01 – 2021-03-31
Keywords: Machine learning / Reinforcement learning
Outline of Final Research Achievements

In this study, compensators were designed using three methods. The first designs the compensator to reduce the difference in the subsequent state caused by the presence or absence of the first-order lag element and the dead time. The second designs the compensator to reduce the difference in the output of the first-order lag element caused by the presence or absence of that element. The third constructs an extended state for the first-order lag element as a low-dimensional representation that exploits the characteristics of the first-order lag. Numerical simulations using a two-link manipulator or an inverted pendulum were performed to confirm the effectiveness of these methods. Finally, we studied a reinforcement learning method that adaptively switches its control strategy according to environmental conditions.
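
For readers unfamiliar with the two actuator models named in the outline, the following is a minimal sketch of a discrete-time first-order lag element and a dead time element. It is not the compensator or learning method from this project. The class names, the forward-Euler discretization, and the parameter values (time_constant, dt, delay_steps) are assumptions chosen only for illustration.

```python
from collections import deque


class FirstOrderLag:
    """Illustrative first-order lag: T * dy/dt = u - y, forward-Euler step.

    Hypothetical helper, not taken from the report.
    """

    def __init__(self, time_constant, dt, y0=0.0):
        # alpha = dt / T; the Euler update is well behaved for 0 < alpha <= 1
        self.alpha = dt / time_constant
        self.y = y0

    def step(self, u):
        # Move the output a fraction alpha of the way toward the command u
        self.y += self.alpha * (u - self.y)
        return self.y


class DeadTime:
    """Illustrative pure delay of `delay_steps` samples using a FIFO buffer."""

    def __init__(self, delay_steps, u0=0.0):
        # Pre-fill the buffer so the first outputs equal the initial value u0
        self.buffer = deque([u0] * delay_steps)

    def step(self, u):
        # The command enters the queue; the oldest queued command is applied
        self.buffer.append(u)
        return self.buffer.popleft()


def step_response(actuator, n_steps, command=1.0):
    """Apply a constant command and record what the actuator actually outputs."""
    return [actuator.step(command) for _ in range(n_steps)]


lag_out = step_response(FirstOrderLag(time_constant=0.5, dt=0.1), 5)
delay_out = step_response(DeadTime(delay_steps=3), 5)
print(lag_out)    # rises gradually toward the command
print(delay_out)  # stays at the initial value for 3 steps, then follows the command
```

In both cases the command chosen by a controller differs from what is actually applied to the plant, which is the discrepancy the compensators described above aim to reduce.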

Free Research Field

Machine learning

Academic Significance and Societal Importance of the Research Achievements

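The results of this research have two main points of academic significance. First, even when a compensator is added after the fact, relearning can be made unnecessary. When learning is carried out in an environment without first-order lag or dead time elements and these elements are added later, the relearning that would otherwise be needed in the modified environment can be avoided. Second, because information about the output values of the first-order lag or dead time elements is not used directly, no additional sensing of the environment is required. Owing to this property, what is seen looking outward from the environment can be treated as unchanged.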


Published: 2022-01-27  
