A CO-EVOLUTIONARY MULTI-AGENT REINFORCEMENT LEARNING SCHEME TAKING ACCOUNT OF APPLICATION TO COMPETITIVE ENVIRONMENTS
Project/Area Number | 16500081
Research Category | Grant-in-Aid for Scientific Research (C)
Allocation Type | Single-year Grants
Section | General
Research Field | Intelligent informatics
Research Institution | The University of Tokushima
Principal Investigator | ONO Norihiko, The University of Tokushima, Faculty of Engineering, Professor (60194594)
Project Period (FY) | 2004 – 2005
Project Status | Completed (Fiscal Year 2005)
Budget Amount | ¥3,700,000 (Direct Cost: ¥3,700,000)
  Fiscal Year 2005: ¥1,500,000 (Direct Cost: ¥1,500,000)
  Fiscal Year 2004: ¥2,200,000 (Direct Cost: ¥2,200,000)
Keywords | MULTI-AGENT SYSTEMS / MULTI-AGENT LEARNING / REINFORCEMENT LEARNING / MACHINE LEARNING / EVOLUTIONARY COMPUTATION / NEURAL NETWORKS / CO-EVOLUTION / NEURO-EVOLUTION / EMERGENT DESIGN / EVOLUTION STRATEGIES / COMPETITIVE GAMES
Research Abstract |
Several attempts have been reported to let multiple monolithic reinforcement learning (RL) agents synthesize the highly coordinated behavior needed to accomplish a common goal effectively. Most of these straightforward applications of RL scale poorly to more complex multi-agent learning problems, because the state space of each RL agent grows exponentially with the number of partner agents engaged in the joint task. To cope with this exponentially large state space in multi-agent RL (MARL), we previously proposed a MARL scheme based on a neural-network representation of each agent's decision policy, optimized with a real-coded genetic algorithm (GA), and showed its effectiveness by applying it to multi-agent learning problems that cannot be solved appropriately by other conventional MARL frameworks.

In general, however, that MARL scheme does not function in competitive environments: no absolute individual fitness function can be provided in advance, so the real-coded GA cannot be applied. In competitive environments, such as one-on-one contests, individual fitness must be evaluated through competition with other individuals rather than by an absolute fitness measure. To remedy this drawback, we extend the MARL scheme by replacing its generation alternation model, MGG, with a co-evolutionary model called CMGG, allowing the individuals (multi-agent systems) in the population to improve their policies co-evolutionarily through competition with one another.

The effectiveness of the extended MARL scheme is demonstrated by applying it to a one-on-one function-approximation contest and to several versions of a two-dimensional air-hockey game. The experimental results show that the extended scheme outperforms one of the best co-evolutionary generation alternation schemes, proposed by Floreano and his colleagues.
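A minimal sketch of the idea follows, filling in details the report does not give: each individual encodes a neural-network policy as a flat real-valued weight vector; because no absolute fitness function exists, candidates are scored by playing matches against the current population; and an MGG-style generation alternation step (two parents replaced by the best members of their family) drives the real-coded GA. The network shape, the BLX-alpha crossover operator, and the play_match stub are illustrative assumptions, not the CMGG specifics.

import numpy as np

POP_SIZE, GENOME_LEN, N_CHILDREN = 20, 49, 8
rng = np.random.default_rng(0)

def policy_output(weights, obs):
    # Tiny one-hidden-layer network policy encoded as a flat weight vector:
    # 5 inputs -> 8 hidden units -> 1 output (40 + 8 + 1 = 49 weights).
    w1 = weights[:40].reshape(8, 5)
    w2 = weights[40:48].reshape(1, 8)
    b = weights[48:]
    return np.tanh(w2 @ np.tanh(w1 @ obs) + b)[0]

def play_match(a, b):
    # Placeholder for one competitive episode (e.g. an air-hockey game).
    # Returns +1 if a wins, -1 otherwise; here just a dummy comparison.
    obs = rng.standard_normal(5)
    return 1.0 if policy_output(a, obs) > policy_output(b, obs) else -1.0

def competitive_fitness(candidates, opponents):
    # Relative fitness: total score over matches against the population,
    # since competitive settings admit no absolute fitness measure.
    return np.array([sum(play_match(c, o) for o in opponents) for c in candidates])

def blx_alpha(p1, p2, alpha=0.5):
    # BLX-alpha crossover, a standard real-coded GA variation operator.
    lo, hi = np.minimum(p1, p2), np.maximum(p1, p2)
    return rng.uniform(lo - alpha * (hi - lo), hi + alpha * (hi - lo))

pop = rng.standard_normal((POP_SIZE, GENOME_LEN))
for generation in range(100):
    # MGG-style step: draw two parents at random, breed a family of
    # children, score the whole family by competition against the
    # current population, and reinsert the two best in the parents' slots.
    i, j = rng.choice(POP_SIZE, size=2, replace=False)
    children = np.array([blx_alpha(pop[i], pop[j]) for _ in range(N_CHILDREN)])
    family = np.vstack([pop[[i, j]], children])
    fitness = competitive_fitness(family, pop)
    best_two = np.argsort(fitness)[-2:]
    pop[i], pop[j] = family[best_two[0]], family[best_two[1]]

Because every fitness evaluation is relative to the evolving population, improvements in one individual raise the bar for all others, which is the co-evolutionary pressure the extended scheme relies on.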