2018 Fiscal Year Final Research Report
Apprenticeship learning for heterogeneous robots
Project/Area Number |
16K16132
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Intelligent robotics
|
Research Institution | Meijo University (2017-2018), Chuo University (2016) |
Principal Investigator |
|
Project Period (FY) |
2016-04-01 – 2019-03-31
|
Keywords | Apprenticeship learning / Inverse reinforcement learning |
Outline of Final Research Achievements |
This project aimed to develop algorithms that transfer reward functions between heterogeneous agents. Relevant inverse reinforcement learning techniques were also studied. The representative contributions of this project are as follows: 1) An inverse reinforcement learning algorithm that assumes the expert and the agent follow non-identical Markov decision processes or use incompatible features. To represent expert demonstrations observed in a distinct feature space, a conditional density estimation technique is used, and it is shown that, with a specific model, the approximation of the demonstrations in the agent's feature space can be expressed in closed form. 2) Non-linear score-based inverse reinforcement learning, which makes it possible to use arbitrary trajectories, i.e. trajectories sampled from a pre-learned policy of the agent, to estimate the reward function.
|
Free Research Field |
Intelligent robotics
|
Academic Significance and Societal Importance of the Research Achievements |
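We consider it significant, in terms of improving robot autonomy, for a robot to construct its objective function on its own from observed information, without a human designing that function by hand. Estimating an objective function with current techniques requires having the robot observe some form of exemplar data. However, the observed subject and the robot differ in many respects, such as their bodies and the demands that society places on them. A simple imitation framework is therefore applicable only in limited situations. This research project presented new findings that mitigate this problem.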
|