1996 Fiscal Year Final Research Report Summary
Coordination of Multiple Behaviors for Competition Robots by Vision-Based Reinforcement Learning
Project/Area Number | 07455112
Research Category | Grant-in-Aid for Scientific Research (B)
Allocation Type | Single-year Grants
Section | General
Research Field | Intelligent mechanics / Mechanical systems
Research Institution | Osaka University
Principal Investigator | ASADA Minoru, Osaka University, Faculty of Engineering, Professor (60151031)
Co-Investigators | SUZUKI Shoji, Osaka University, Faculty of Engineering, Research Associate (50273587)
 | HOSODA Koh, Osaka University, Faculty of Engineering, Associate Professor (10252610)
Project Period (FY) | 1995 – 1996
Keywords | Reinforcement Learning / Vision / Behavior Coordination / Modular Learning / Hidden States / AIC / Mobile Robots
Research Abstract |
Coordination of multiple behaviors that are independently obtained by reinforcement learning is one of the key issues in scaling the method to larger and more complex robot learning tasks. Directly combining the state spaces of all the individual modules (subtasks) requires enormous learning time and introduces hidden states. In this project, we proposed a method that accomplishes a whole task consisting of plural subtasks by coordinating multiple behaviors acquired through vision-based reinforcement learning (first year), and then modified the method by introducing modular learning, which coordinates multiple behaviors while taking account of the trade-off between learning time and performance (second year).

The first year:
1. Individual behaviors that achieve the corresponding subtasks were independently acquired by Q-learning.
2. Three kinds of coordination of multiple behaviors were considered: simple summation of the different action-value functions, switching between action-value functions according to the situation, and learning with the previously obtained action-value functions as the initial values of a new action-value function (a code sketch of the three schemes follows the abstract).
3. The task of shooting a ball into the goal while avoiding collisions with an opponent was examined. This task can be decomposed into a ball-shooting subtask and a collision-avoiding subtask.
4. As a result, the learning method was the best of the three in shooting ratio, mean steps to the goal, and avoidance performance.

The second year:
1. To reduce the learning time, the whole state space was classified into two categories based on the action values obtained separately by Q-learning: the area where one of the learned behaviors was directly applicable (the no-more-learning area), and the area where learning was still necessary because of competition between multiple behaviors (the re-learning area).
2. Hidden states were detected by fitting models to the learned action values and selecting among them with an information criterion (see the second sketch below).
3. The initial action values in the re-learning area were adjusted so that they were consistent with the values in the no-more-learning area.
4. The method was applied to one-on-one soccer-playing robots, and the validity of the proposed method was shown by computer simulation and real robot experiments.
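The report itself contains no code; the following is a minimal Python sketch, under stated assumptions, of the three coordination schemes compared in the first year. All identifiers (Q_shoot, Q_avoid, opponent_near, the table sizes, and the initialization rule in make_initial_Q) are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch only -- not the authors' code. Two tabular action-value
# functions learned independently for the two subtasks (placeholder values).
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 64, 4

Q_shoot = rng.random((N_STATES, N_ACTIONS))   # ball-shooting behavior
Q_avoid = rng.random((N_STATES, N_ACTIONS))   # collision-avoiding behavior

def act_summation(s):
    """Scheme 1: simple summation of the two action-value functions."""
    return int(np.argmax(Q_shoot[s] + Q_avoid[s]))

def act_switching(s, opponent_near):
    """Scheme 2: switch between the tables according to the situation
    (here, a hypothetical 'opponent is near' predicate)."""
    Q = Q_avoid if opponent_near(s) else Q_shoot
    return int(np.argmax(Q[s]))

def make_initial_Q():
    """Scheme 3: use the previously learned values to initialize a new
    action-value function, then continue Q-learning on the whole task.
    The summation here is one plausible seed; the report does not
    specify the exact combination rule."""
    return Q_shoot + Q_avoid

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update used to refine the new table."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# Example: choose an action in state 3 under scheme 2.
a = act_switching(3, lambda s: False)
```

In this framing, scheme 3 is the one the report found best: the combined table merely seeds further Q-learning on the whole task, so values in regions where the two behaviors conflict can be corrected by experience rather than fixed in advance.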
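Step 2 of the second year detects hidden states by fitting models to the learned action values and comparing them with an information criterion (AIC, per the keywords). The report does not specify the model class, so the sketch below assumes, purely for illustration, polynomial models over a one-dimensional state feature: if even the AIC-selected model leaves large residuals at some states, the observable state there does not determine the value, which is the signature of a hidden state.

```python
# Hedged sketch of AIC-based hidden-state detection; the polynomial model
# class and the residual threshold are assumptions, not the report's method.
import numpy as np

def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit with Gaussian errors:
    AIC = n * ln(RSS / n) + 2k."""
    n = len(residuals)
    rss = float(np.sum(residuals ** 2))
    return n * np.log(rss / n) + 2 * n_params

def detect_hidden_states(x, q_values, max_degree=5, threshold=2.0):
    """Fit polynomials of increasing degree to the learned action values,
    select the degree by AIC, and flag states whose residual under the
    best model is large -- i.e., states whose value the observation
    fails to explain, suggesting a hidden state."""
    best_aic, best_resid = np.inf, None
    for deg in range(1, max_degree + 1):
        coeffs = np.polyfit(x, q_values, deg)
        resid = q_values - np.polyval(coeffs, x)
        aic = aic_least_squares(resid, deg + 1)
        if aic < best_aic:
            best_aic, best_resid = aic, resid
    sigma = best_resid.std()
    return np.abs(best_resid) > threshold * sigma   # boolean mask of suspect states
```

States flagged by such a test would fall into the re-learning area described above, while well-explained states can keep their learned behavior unchanged.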