Coordination of Multiple Behaviors for Competition Robots by Vision-Based Reinforcement Learning

Research Project

Project/Area Number 07455112
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Intelligent mechanics/Mechanical systems
Research Institution Osaka University

Principal Investigator

ASADA Minoru  Osaka University, Faculty of Engineering, Professor (60151031)

Co-Investigator (Kenkyū-buntansha) SUZUKI Shoji  Osaka University, Faculty of Engineering, Research Associate (50273587)
HOSODA Koh  Osaka University, Faculty of Engineering, Associate Professor (10252610)
Project Period (FY) 1995 – 1996
Project Status Completed (Fiscal Year 1996)
Budget Amount
¥7,500,000 (Direct Cost: ¥7,500,000)
Fiscal Year 1996: ¥2,000,000 (Direct Cost: ¥2,000,000)
Fiscal Year 1995: ¥5,500,000 (Direct Cost: ¥5,500,000)
Keywords Reinforcement Learning / Vision / Behavior Coordination / Modular Learning / Hidden States / AIC / Mobile Robots / Multiple Tasks / Competitive Behaviors / Cooperative Behaviors
Research Abstract

Coordinating multiple behaviors that were obtained independently by reinforcement learning is one of the issues that must be solved before the method can scale to larger and more complex robot learning tasks. Directly combining the state spaces of all individual modules (subtasks) requires enormous learning time and causes hidden states. In the first year of this project, we proposed a method that accomplishes a whole task consisting of multiple subtasks by coordinating multiple behaviors acquired by vision-based reinforcement learning. In the second year, we modified the method by introducing modular learning, which coordinates multiple behaviors while taking into account the trade-off between learning time and performance.
The first year:
1. Individual behaviors, each achieving its corresponding subtask, were acquired independently by Q-learning.
2. Three kinds of coordination of multiple behaviors were considered: simple summation of the different action-value functions, switching between action-value functions according to the situation, and learning with the previously obtained action-value functions as initial values of a new action-value function.
3. A task of shooting a ball into the goal while avoiding collisions with an opponent was examined. The task can be decomposed into a ball-shooting subtask and a collision-avoidance subtask.
4. As a result, the learning-based coordination (the third kind) was the best in shooting ratio, mean steps to the goal, and avoidance performance.
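
The three kinds of coordination listed above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes tabular action-value functions that share one state index, and the names `act_by_summation`, `act_by_switching`, `init_new_q`, the `opponent_near` flag, and the choice of the element-wise sum as the initial value are all hypothetical.

```python
import numpy as np

# Assumption: both behaviors are tabular Q-functions of shape
# (n_states, n_actions) indexed by one shared, hypothetical state id.

def act_by_summation(q_shoot, q_avoid, state):
    """Greedy action under the simple sum of the two action-value functions."""
    return int(np.argmax(q_shoot[state] + q_avoid[state]))

def act_by_switching(q_shoot, q_avoid, state, opponent_near):
    """Greedy action under whichever function the situation selects.

    `opponent_near` is a hypothetical situation flag; the source does not
    say how situations were distinguished.
    """
    q = q_avoid if opponent_near else q_shoot
    return int(np.argmax(q[state]))

def init_new_q(q_shoot, q_avoid):
    """Seed a new action-value function from the previously learned ones.

    Assumption: the seed is the element-wise sum; the source only says the
    earlier functions served as initial values.
    """
    return (q_shoot + q_avoid).astype(float)

def q_learning_update(q, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """One standard Q-learning step, applied in place to the seeded table."""
    td_target = reward + gamma * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return q
```

In this sketch the first two functions only select actions from fixed tables, while the third kind keeps learning on the whole task starting from `init_new_q`, which is the variant the abstract reports as performing best.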
The second year:
1. To reduce the learning time, the whole state space was classified into two categories based on the action values obtained separately by Q-learning: the area where one of the learned behaviors was directly applicable (no-more-learning area), and the area where learning was necessary because multiple behaviors competed (re-learning area).
2. Hidden states were detected by fitting a model to the learned action values, based on an information criterion.
3. The initial action values in the re-learning area were adjusted so that they were consistent with the values in the no-more-learning area.
4. The method was applied to one-on-one soccer-playing robots, and the validity of the proposed method was shown by computer simulation and real robot experiments.
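
As a rough illustration of the first two steps above, the sketch below splits a tabular state space into a re-learning area and a no-more-learning area, and scores a fitted model with AIC (listed in the keywords). The record does not give the actual classification rule or the model family that was fitted, so both are assumptions here: a state is marked for re-learning when both behaviors have non-negligible action values and prefer different actions, and AIC is computed in its common least-squares form under Gaussian errors. The names `split_state_space` and `aic_least_squares` and the threshold `eps` are hypothetical.

```python
import numpy as np

def split_state_space(q_a, q_b, eps=1e-6):
    """Return a boolean mask over states: True = re-learning area.

    Assumed rule (not from the source): behaviors compete in a state when
    both have some non-negligible action value there and their greedy
    actions differ. All other states form the no-more-learning area.
    """
    active_a = np.max(np.abs(q_a), axis=1) > eps
    active_b = np.max(np.abs(q_b), axis=1) > eps
    disagree = np.argmax(q_a, axis=1) != np.argmax(q_b, axis=1)
    return active_a & active_b & disagree

def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to a constant.

    AIC = n * ln(RSS / n) + 2 * k, where lower values indicate a better
    trade-off between fit and model size. Requires a nonzero RSS.
    """
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    rss = float(np.sum(residuals ** 2))
    return n * np.log(rss / n) + 2 * n_params
```

Under these assumptions, only the states flagged by `split_state_space` would be revisited by further Q-learning, and candidate models fitted to the learned action values would be compared by picking the one with the lowest `aic_least_squares` score.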

Report

(3 results)
  • 1996 Annual Research Report   Final Research Report Summary
  • 1995 Annual Research Report
  • Research Products

    (16 results)

All Publications (16 results)

  • [Publications] M. Asada: "Agents that learn from other competitive agents" Proc. of Machine Learning Conference Workshop on Agents That Learn from Other Agents. 1-7 (1995)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1996 Final Research Report Summary
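  • [Publications] Eiji Uchibe: "Achievement of multiple tasks by reinforcement learning for a mobile robot with vision" Proc. of the Robotics and Mechatronics Conference '95. 700-703 (1995)

    • Description
      From the Final Research Report Summary (Japanese)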
    • Related Report
      1996 Final Research Report Summary
  • [Publications] Eiji Uchibe: "Skill learning for soccer robots" Proc. of the Tsukuba Software Symposium. 43-46 (1996)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1996 Final Research Report Summary
  • [Publications] Eiji Uchibe: "Behavior coordination for a mobile robot using modular reinforcement learning" Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems 1996 (IROS'96). 1329-1336 (1996)

    • Description
      From the Final Research Report Summary (Japanese)
    • Related Report
      1996 Final Research Report Summary
  • [Publications] M. Asada: "Agents that learn from other competitive agents" Proc. of Machine Learning Conference Workshop on Agents That Learn from Other Agents. 1-7 (1995)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1996 Final Research Report Summary
  • [Publications] E. Uchibe: "Achievement of Multiple Tasks for a Mobile Robot with a Visual Sensor Using Delayed Reinforcement Learning" JSME Annual Conference on Robotics and Mechatronics (ROBOMECH '95). 1-7 (1995)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1996 Final Research Report Summary
  • [Publications] E. Uchibe: "A Purposive Behavior Acquisition for a Soccer Robot Using Reinforcement Learning" TSUKUBA Software Symposium '96. 700-708 (1995)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1996 Final Research Report Summary
  • [Publications] Eiji Uchibe: "Behavior coordination for a mobile robot using modular reinforcement learning" Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems 1996 (IROS'96). 1329-1336 (1996)

    • Description
      From the Final Research Report Summary (English)
    • Related Report
      1996 Final Research Report Summary
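  • [Publications] Eiji Uchibe: "Robot behavior acquisition by vision-based reinforcement learning in an environment with competing agents" Proc. of the 8th Symposium on Autonomous Decentralized Systems. 371-374 (1996)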

    • Related Report
      1996 Annual Research Report
  • [Publications] Eiji Uchibe: "Skill learning for soccer robots" Proc. of the Tsukuba Software Symposium. 43-46 (1996)

    • Related Report
      1996 Annual Research Report
  • [Publications] Eiji Uchibe: "Behavior coordination for a mobile robot using modular reinforcement learning" Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems 1996 (IROS'96). 1329-1336 (1996)

    • Related Report
      1996 Annual Research Report
  • [Publications] Eiji Uchibe: "Vision-based reinforcement learning for RoboCup: Towards real robot competition" Proc. of IROS-96 Workshop on RoboCup. 1329-1336 (1996)

    • Related Report
      1996 Annual Research Report
  • [Publications] Minoru Asada: "Agents that learn from other competitive agents" Proc. of Machine Learning Conference Workshop on Agents That Learn from Other Agents. 1-7 (1995)

    • Related Report
      1995 Annual Research Report
  • [Publications] Eiji Uchibe: "Achievement of multiple tasks by reinforcement learning for a mobile robot with vision" Proc. of the Robotics and Mechatronics Conference '95. 700-703 (1995)

    • Related Report
      1995 Annual Research Report
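  • [Publications] Eiji Uchibe: "Understanding the behavior of other agents: Toward applying reinforcement learning to multi-agent environments for soccer robots" Proc. of the 13th Annual Conference of the Robotics Society of Japan. 241-242 (1995)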

    • Related Report
      1995 Annual Research Report
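  • [Publications] Eiji Uchibe: "Robot behavior acquisition by vision-based reinforcement learning in an environment with competing agents" Proc. of the 8th Symposium on Autonomous Decentralized Systems. 371-374 (1996)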

    • Related Report
      1995 Annual Research Report

Published: 1995-04-01   Modified: 2016-04-21  