Apprenticeship learning for heterogeneous robots

Research Project

Project/Area Number 16K16132
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field Intelligent robotics
Research Institution Meijo University (2017-2018)
Chuo University (2016)

Principal Investigator

Masuyama Gakuto  Meijo University, Faculty of Science and Technology, Associate Professor (20707088)

Project Period (FY) 2016-04-01 – 2019-03-31
Project Status Completed (Fiscal Year 2018)
Budget Amount
¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2017: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2016: ¥3,120,000 (Direct Cost: ¥2,400,000, Indirect Cost: ¥720,000)
Keywords Apprenticeship learning / Inverse reinforcement learning / Reinforcement learning / Intelligent robotics
Outline of Final Research Achievements

This project aimed to develop algorithms that transfer a reward function between heterogeneous agents, and it also studied related inverse reinforcement learning techniques. The representative contributions are as follows. 1) An inverse reinforcement learning algorithm for the case where the expert and the agent follow non-identical Markov decision processes or have incompatible features. To represent expert demonstrations observed in a distinct feature space, a conditional density estimation technique is used, and it is shown that, with a specific model, the approximation of the demonstrations in the agent's feature space can be written in closed form. 2) Non-linear score-based inverse reinforcement learning, which allows the reward function to be estimated from arbitrary trajectories, i.e., trajectories sampled from a pre-learned policy of the agent.
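To make the score-based idea in contribution 2 concrete, the following is a minimal sketch and not the project's method. It assumes a reward that is linear in the state features, whereas the project studied a non-linear variant, and it fits the weights by ridge regression of trajectory scores onto discounted feature sums. The function names, the discount factor, the ridge strength, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def discounted_feature_sum(features, gamma):
    """Return sum_t gamma^t * phi(s_t) for one trajectory given as a (T, d) array."""
    features = np.asarray(features, dtype=float)
    discounts = gamma ** np.arange(features.shape[0])
    return discounts @ features


def fit_linear_reward(trajectories, scores, gamma=0.9, ridge=1e-6):
    """Ridge-regress trajectory scores onto discounted feature sums.

    Assumption: score(trajectory) ~ theta . sum_t gamma^t phi(s_t),
    so the per-state reward is r(s) = theta . phi(s).
    """
    M = np.stack([discounted_feature_sum(t, gamma) for t in trajectories])
    y = np.asarray(scores, dtype=float)
    d = M.shape[1]
    return np.linalg.solve(M.T @ M + ridge * np.eye(d), M.T @ y)


# Synthetic check: scores are generated from a known weight vector,
# and the regression should recover it from arbitrary (random) trajectories.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])
trajectories = [rng.normal(size=(20, 3)) for _ in range(50)]
scores = [discounted_feature_sum(t, 0.9) @ true_theta for t in trajectories]
theta_hat = fit_linear_reward(trajectories, scores, gamma=0.9)
```

Because the scores are attached to whole trajectories, the trajectories do not have to come from an expert. In this sketch they are random, which mirrors the outline's point that arbitrary trajectories can be used.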

Academic Significance and Societal Importance of the Research Achievements

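We consider it significant, in terms of improving robot autonomy, for a robot to construct its objective function on its own from observed information, without a human designing that function by hand. With current technology, estimating an objective function requires the robot to observe some form of exemplar data. However, the observed subject and the robot differ in many respects, such as their bodies and the demands that society places on them. As a result, a simple imitation framework is applicable only in limited situations. This research project presented new findings that mitigate this problem.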

Report

(4 results)
  • 2018 Annual Research Report, Final Research Report (PDF)
  • 2017 Research-status Report
  • 2016 Research-status Report

Research Products

(5 results)

Presentation (5 results) (of which Int'l Joint Research: 2 results)

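  • [Presentation] Estimation of a Nonlinear Reward Function Using Inverse Reinforcement Learning Based on Trajectory Scores (軌道のスコアに基づく逆強化学習を用いた非線形な報酬関数の推定) (2018)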

    • Author(s)
      渡邉 夏美, 増山 岳人, 梅田 和昇
    • Organizer
      2018 Annual Conference of the Japanese Society for Artificial Intelligence (2018年度人工知能学会全国大会)
    • Related Report
      2018 Annual Research Report
  • [Presentation] Apprenticeship Learning in an Incompatible Feature Space (2017)

    • Author(s)
      Gakuto Masuyama, Kazunori Umeda
    • Organizer
      2017 IEEE International Conference on Robotics and Automation (ICRA2017)
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
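  • [Presentation] Self-Generation of Trajectories by Dynamic Programming for Score-Based Inverse Reinforcement Learning (スコアに基づく逆強化学習のための動的計画法による軌道の自己生成) (2017)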

    • Author(s)
      渡邉 夏美, 増山 岳人, 梅田 和昇
    • Organizer
      Proceedings of the JSME Conference on Robotics and Mechatronics 2017 (日本機械学会ロボティクス・メカトロニクス講演会2017講演論文集)
    • Related Report
      2017 Research-status Report
  • [Presentation] Apprenticeship Learning in an Incompatible Feature Space (2017)

    • Author(s)
      Gakuto Masuyama, Kazunori Umeda
    • Organizer
      The 2017 IEEE International Conference on Robotics and Automation
    • Place of Presentation
      Sands Expo and Convention Centre, Marina Bay Sands in Singapore
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
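  • [Presentation] Self-Generation of Trajectories by Dynamic Programming for Score-Based Inverse Reinforcement Learning (スコアに基づく逆強化学習のための動的計画法による軌道の自己生成) (2017)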

    • Author(s)
      渡邉夏美,増山岳人,梅田和昇
    • Organizer
      JSME Conference on Robotics and Mechatronics 2017 (日本機械学会ロボティクス・メカトロニクス講演会2017)
    • Place of Presentation
      Big Palette Fukushima (Koriyama, Fukushima Prefecture)
    • Related Report
      2016 Research-status Report

Published: 2016-04-21   Modified: 2020-03-30  
