1999 Fiscal Year Final Research Report Summary

Synthesis of Coordinated Behavior by Autonomous Agents

Research Project

Project/Area Number	10680384
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	The University of Tokushima
Principal Investigator	ONO Norihiko, The University of Tokushima, Faculty of Engineering, Professor (60194594)
Co-Investigator(Kenkyū-buntansha)	ITO Takuya, The University of Tokushima, Faculty of Engineering, Research Associate (50314844) / ONO Isao, The University of Tokushima, Faculty of Engineering, Lecturer (00304551)
Project Period (FY)	1998 – 1999
Keywords	MULTI-AGENT SYSTEMS / REINFORCEMENT LEARNING / EVOLUTIONARY ALGORITHMS / MULTI-AGENT LEARNING / ARTIFICIAL INTELLIGENCE / COORDINATED BEHAVIOR / AUTONOMOUS AGENTS
Research Abstract	Several attempts have been reported to let multiple monolithic reinforcement learning (RL) agents synthesize the highly coordinated behavior needed to accomplish their common goal effectively. Most of these straightforward applications of RL scale poorly to more complex multi-agent (MA) learning problems, because the state space of each RL agent grows exponentially with the number of partner agents engaged in the joint task. To remedy the exponentially large state space in multi-agent RL (MARL), we previously proposed a modular approach and demonstrated its effectiveness by applying it to MA learning problems. The results obtained with the modular approach to MARL are encouraging, but a serious problem remains. The performance of modular RL agents depends strongly on their modular structures, so appropriate structures have to be designed for the agents. However, it is extremely difficult to identify such structures in a top-down manner, because we cannot correctly predict the performance of a given MA system, which consists of multiple modular RL agents and is accordingly of substantial complexity with respect to both its structure and its functionality. This means that appropriate modular structures for the agents have to be identified by trial and error. To overcome this problem, a framework for automatically synthesizing appropriate modular structures for the agents has to be established. We suppose that a collection of homogeneous modular RL agents is engaged in a joint task aimed at accomplishing their common goal, and that they all share the same modular structure. We proposed a framework for identifying an appropriate modular structure for the agents, which begins with a randomly generated structure and attempts to improve it incrementally.
A modular structure is represented by a set of a variable number of learning modules, and is evaluated based on the performance of the RL agents employing the structure. The modular structure is improved using a kind of hill-climbing scheme. A set of simple operators is devised, each generating a neighborhood of the current structure. To show the effectiveness of the proposed framework, we applied it to a multi-agent learning problem called the Simulated Dodgeball Game-II, and attempted to identify an appropriate modular structure for the attacker agents, each implemented by an independent but homogeneous modular RL architecture. Here, a modular structure is evaluated based on the performance of the attacker agents employing it. The results are quite encouraging. For example, using this framework we always identified a modular structure that substantially outperforms those manually designed by a human expert.
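The hill-climbing search over modular structures described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the report: the encoding of a module as the set of partner agents it observes, the four operators in `neighbors`, and the placeholder `toy_score`, which stands in for the real evaluation (running the RL agents that employ the structure and measuring their performance).

```python
import random

# Assumed encoding (not from the report): a module is a frozenset of the
# partner-agent indices it observes; a structure is a frozenset of modules,
# so the number of modules is variable.
PARTNERS = range(4)

def random_structure(rng, max_modules=3):
    """Randomly generated starting structure with at least one module."""
    k = rng.randint(1, max_modules)
    return frozenset(
        frozenset(rng.sample(PARTNERS, rng.randint(1, len(PARTNERS))))
        for _ in range(k)
    )

def neighbors(structure):
    """Neighborhood produced by four hypothetical simple operators."""
    out = set()
    for p in PARTNERS:                      # add a single-partner module
        out.add(structure | {frozenset([p])})
    for m in structure:
        rest = structure - {m}
        if rest:                            # delete a module, keep at least one
            out.add(rest)
        for p in PARTNERS:                  # extend a module by one partner
            out.add(rest | {m | {p}})
        if len(m) > 1:
            for p in m:                     # shrink a module by one partner
                out.add(rest | {m - {p}})
    out.discard(structure)
    return out

def toy_score(structure):
    """Placeholder evaluation (assumption): reward partner coverage and
    penalize modules whose lookup tables grow exponentially with size."""
    covered = set().union(*structure)
    cost = sum(2 ** len(m) for m in structure)
    return 10 * len(covered) - cost

def hill_climb(start, score, max_steps=100):
    """Move to the best neighbor until no neighbor improves the score."""
    current, best = start, score(start)
    for _ in range(max_steps):
        candidate = max(neighbors(current), key=score)
        if score(candidate) <= best:
            break
        current, best = candidate, score(candidate)
    return current, best

start = random_structure(random.Random(0))
structure, value = hill_climb(start, toy_score)
print(sorted(sorted(m) for m in structure), value)
```

In the setting of the report, the score would come from the measured performance of the agents, which is noisy and expensive to obtain; the sketch only shows the shape of the search loop.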

Research Products
(19 results)

All Publications (19 results)

[Publications] N. Ono and S. Yoshida: "Synthesis of Coordinated Behavior in Simulated Dodgeball Games" Proceedings of International Conference on Intelligent Autonomous Systems 5. 663-668 (1998)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] N. Ono and S. Yoshida: "Synthetic Collective Behavior by Multiple Reinforcement Learning Agents in Simulated Dodgeball Game" Proceedings of the Fourth International Symposium on Artificial Life and Robotics (AROB 4th '99). 540-543 (1999)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] H. Fujiki, I. Ono and N. Ono: "A Reinforcement Learning Scheme based on Decision Tree Representation of State Space and Its Genetic Acquisition" Proceedings of the 5th International Symposium on Artificial Life and Robotics (AROB 5th '00). 563-567 (2000)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] T. Nijo, I. Ono and N. Ono: "Evolution of Modular Structures for Multiple Reinforcement Learning Agents" Proceedings of the 5th International Symposium on Artificial Life and Robotics (AROB 5th '00). 576-579 (2000)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] I. Ono, T. Nijo and N. Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents" Proceedings of the 2000 Genetic and Evolutionary Computation Conference (GECCO-2000). (in press). (2000)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] I. Ono, M. Takahashi and N. Ono: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA using the Unimodal Normal Distribution Crossover" Proceedings of the 2000 Congress on Evolutionary Computation (CEC2000). (in press). (2000)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] T. Ito, H. Iba and S. Sato: "Advances in Genetic Programming, Vol. 3 (contributed chapter)" The MIT Press. 476 (1999)
- Description
  From "Summary of Research Results Report (Japanese)"
[Publications] N. Ono and S. Yoshida: "Synthesis of Coordinated Behavior in Simulated Dodgeball Games." Proceedings of International Conference on Intelligent Autonomous Systems. 5. 663-668 (1998)
- Description
  From "Summary of Research Results Report (English)"
[Publications] S. Yoshida and N. Ono: "Multi-agent Reinforcement Learning via Decomposed State Representation and Structural Credit Assignments." Proceedings of the 42nd Annual Conference of the Institute of Systems, Control and Information Engineers. 35-36 (1998)
- Description
  From "Summary of Research Results Report (English)"
[Publications] S. Yoshida and N. Ono: "Multi-agent Reinforcement Learning via Decomposed State Representation." Proceedings of the SICE 26th Annual Symposium on Intelligent Systems. 163-168 (1998)
- Description
  From "Summary of Research Results Report (English)"
[Publications] H. Fujiki, Isao Ono and N. Ono: "Decision Tree Representation of State Space for Reinforcement Learning Agents and Its Acquisition based on Genetic Programming." Proceedings of the 43rd Annual Conference of the Institute of Systems, Control and Information Engineers. 25-26 (1999)
- Description
  From "Summary of Research Results Report (English)"
[Publications] T. Nijo, Isao Ono and N. Ono: "Evolutionary Acquisition of Agent Structures in Modular Reinforcement Learning." Proceedings of the 43rd Annual Conference of the Institute of Systems, Control and Information Engineers. 27-28 (1999)
- Description
  From "Summary of Research Results Report (English)"
[Publications] N. Ono and S. Yoshida: "Synthetic Collective Behavior by Multiple Reinforcement Learning Agents in Simulated Dodgeball Game." Proceedings of the Fourth International Symposium on Artificial Life and Robotics (AROB 4th '99). 540-543 (1999)
- Description
  From "Summary of Research Results Report (English)"
[Publications] M. Doi, Isao Ono, N. Ono, H. Kimura and S. Kobayashi: "Discussion on Application of Reinforcement Learning in a Real Environment." Proceedings of FAN Symposium '99 in Fukui. 133-138 (1999)
- Description
  From "Summary of Research Results Report (English)"
[Publications] M. Takahashi, Isao Ono and N. Ono: "Evolutionary Acquisition of Policies in an Environment with Delayed Reward." Proceedings of FAN Symposium '99 in Fukui. 291-296 (1999)
- Description
  From "Summary of Research Results Report (English)"
[Publications] H. Fujiki, Isao Ono and N. Ono: "A Reinforcement Learning Scheme based on Decision Tree Representation of State Space and Its Genetic Acquisition." Proceedings of the Fifth International Symposium on Artificial Life and Robotics (AROB 5th '00). 563-567 (2000)
- Description
  From "Summary of Research Results Report (English)"
[Publications] T. Nijo, Isao Ono and N. Ono: "Evolution of Modular Structures for Multiple Reinforcement Learning Agents." Proceedings of the Fifth International Symposium on Artificial Life and Robotics (AROB 5th '00). 576-579 (2000)
- Description
  From "Summary of Research Results Report (English)"
[Publications] I. Ono, T. Nijo and N. Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents." Proceedings of the 2000 Genetic and Evolutionary Computation Conference (GECCO-2000). (to appear). (2000)
- Description
  From "Summary of Research Results Report (English)"
[Publications] I. Ono, M. Takahashi and N. Ono: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA using the Unimodal Normal Distribution Crossover." Proceedings of the 2000 Congress on Evolutionary Computation (CEC2000). (to appear). (2000)
- Description
  From "Summary of Research Results Report (English)"
