Synthesis of Coordinated Behavior by Autonomous Agents

Research Project

Project/Area Number	10680384
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	The University of Tokushima
Principal Investigator	ONO Norihiko THE UNIVERSITY OF TOKUSHIMA, FACULTY OF ENGINEERING, PROFESSOR (60194594)
Co-Investigator(Kenkyū-buntansha)	ITO Takuya THE UNIVERSITY OF TOKUSHIMA, FACULTY OF ENGINEERING, RESEARCH ASSOCIATE (50314844); ONO Isao THE UNIVERSITY OF TOKUSHIMA, FACULTY OF ENGINEERING, LECTURER (00304551)
Project Period (FY)	1998 – 1999
Project Status	Completed (Fiscal Year 1999)
Budget Amount	¥1,000,000 (Direct Cost: ¥1,000,000) Fiscal Year 1999: ¥1,000,000 (Direct Cost: ¥1,000,000)
Keywords	MULTI-AGENT SYSTEMS / REINFORCEMENT LEARNING / EVOLUTIONARY ALGORITHMS / MULTI-AGENT LEARNING / ARTIFICIAL INTELLIGENCE / COORDINATED BEHAVIOR / AUTONOMOUS AGENTS / MULTI-AGENT REINFORCEMENT LEARNING / DISTRIBUTED ARTIFICIAL INTELLIGENCE / EMERGENCE
Research Abstract	Several attempts have been reported to let multiple monolithic reinforcement learning (RL) agents synthesize the highly coordinated behavior needed to accomplish their common goal effectively. Most of these straightforward applications of RL scale poorly to more complex multi-agent (MA) learning problems, because the state space for each RL agent grows exponentially with the number of partner agents engaged in the joint task. To remedy the exponentially large state space in multi-agent RL (MARL), we previously proposed a modular approach and demonstrated its effectiveness by applying it to MA learning problems. The results obtained with the modular approach to MARL are encouraging, but it still has a serious problem. The performance of modular RL agents depends strongly on their modular structures, and hence we have to design appropriate structures for the agents. However, it is extremely difficult to identify such structures in a top-down manner, because we cannot correctly predict the performance of a given MA system, which consists of multiple modular RL agents and is accordingly of substantial complexity with respect to both its structure and its functionality. This means that we have to identify appropriate modular structures for the agents by trial and error. To overcome this problem, we have to establish a framework for automatically synthesizing appropriate modular structures for the agents. We suppose that a collection of homogeneous modular RL agents is engaged in a joint task aimed at accomplishing their common goal, and that they share the same modular structure. We proposed a framework for identifying an appropriate modular structure for the agents, which begins with a randomly generated structure and attempts to improve it incrementally.
A modular structure is represented by a set of a variable number of learning modules, and is evaluated based on the performance of the RL agents employing the structure. The modular structure is improved using a kind of hill-climbing scheme. A set of simple operators is devised, each generating a neighborhood of the current structure. To show the effectiveness of the proposed framework, we applied it to a multi-agent learning problem called the Simulated Dodgeball Game-II and attempted to identify an appropriate modular structure for the attacker agents, each implemented by an independent but homogeneous modular RL architecture. In this setting, a modular structure is evaluated based on the performance of the attacker agents employing it. The results are quite encouraging. Using this framework, for example, we always identified a modular structure that substantially outperforms those manually designed by a human expert.
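
The hill-climbing search over modular structures described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the project: a module is modeled as a subset of feature indices, the three operators (add, remove, modify) are hypothetical, and toy_score is a stand-in for the real evaluation, which would require training the RL agents in the Simulated Dodgeball Game-II and measuring their performance.

```python
import random
from typing import Callable, FrozenSet, List, Tuple

Module = FrozenSet[int]        # assumption: a module observes a subset of feature indices
Structure = FrozenSet[Module]  # a structure is a variable-size set of modules

N_FEATURES = 6  # assumed number of observable features (hypothetical)


def random_module(rng: random.Random) -> Module:
    """Draw a module observing 1 to 3 randomly chosen features."""
    return frozenset(rng.sample(range(N_FEATURES), rng.randint(1, 3)))


def neighbors(s: Structure, rng: random.Random) -> List[Structure]:
    """Hypothetical operators, each producing neighbors of the current structure."""
    out = [s | {random_module(rng)}]            # add a module
    if len(s) > 1:
        out.extend(s - {m} for m in s)          # remove a module
    for m in s:                                 # modify: toggle one feature
        changed = m ^ {rng.randrange(N_FEATURES)}
        if changed:
            out.append((s - {m}) | {changed})
    return [n for n in out if n != s]


def toy_score(s: Structure) -> float:
    """Stand-in evaluation (NOT the project's): reward feature coverage and
    penalize a proxy for per-module table size."""
    covered = set().union(*s) if s else set()
    return len(covered) - 0.1 * sum(2 ** len(m) for m in s)


def hill_climb(evaluate: Callable[[Structure], float], steps: int,
               seed: int = 0) -> Tuple[Structure, float]:
    """Start from a random structure and move to the best neighbor
    whenever it scores strictly higher than the current structure."""
    rng = random.Random(seed)
    current: Structure = frozenset(
        random_module(rng) for _ in range(rng.randint(1, 3)))
    best = evaluate(current)
    for _ in range(steps):
        candidates = neighbors(current, rng)
        if not candidates:
            continue
        score, top = max(((evaluate(c), c) for c in candidates),
                         key=lambda t: t[0])
        if score > best:
            current, best = top, score
    return current, best


if __name__ == "__main__":
    structure, score = hill_climb(toy_score, steps=50, seed=1)
    print(sorted(sorted(m) for m in structure), round(score, 2))
```

In a real setting the evaluate argument would be the expensive part, since each call implies training and testing a team of agents; the sketch only shows the shape of the search loop.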

Report

(3 results)

1999 Annual Research Report, Final Research Report Summary
1998 Annual Research Report

Research Products
(31 results)

All Publications (31 results)

[Publications] N. Ono and S. Yoshida: "Synthesis of Coordinated Behavior in Simulated Dodgeball Games" Proceedings of International Conference on Intelligent Autonomous Systems 5. 663-668 (1998)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] N. Ono and S. Yoshida: "Synthetic Collective Behavior by Multiple Reinforcement Learning Agents in Simulated Dodgeball Game" Proceedings of the Fourth International Symposium on Artificial Life and Robotics (AROB 4th '99). 540-543 (1999)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] H. Fujiki, I. Ono and N. Ono: "A Reinforcement Learning Scheme based on Decision Tree Representation of State Space and Its Genetic Acquisition" Proceedings of the 5th International Symposium on Artificial Life and Robotics (AROB 5th '00). 563-567 (2000)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] T. Nijo, I. Ono and N. Ono: "Evolution of Modular Structures for Multiple Reinforcement Learning Agents" Proceedings of the 5th International Symposium on Artificial Life and Robotics (AROB 5th '00). 576-579 (2000)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] I. Ono, T. Nijo and N. Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents" Proceedings of the 2000 Genetic and Evolutionary Computation Conference (GECCO-2000). (in press). (2000)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] I. Ono, M. Takahashi and N. Ono: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA using the Unimodal Normal Distribution Crossover" Proceedings of the 2000 Congress on Evolutionary Computation (CEC2000). (in press). (2000)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] T. Ito, H. Iba and S. Sato: "Advances in Genetic Programming, Vol. 3 (contributed in part)" The MIT Press. 476 (1999)
- Description
  From the Summary of Research Results Report (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] N. Ono and S. Yoshida: "Synthesis of Coordinated Behavior in Simulated Dodgeball Games." Proceedings of International Conference on Intelligent Autonomous Systems. 5. 663-668 (1998)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] S. Yoshida and N. Ono: "Multi-agent Reinforcement Learning via Decomposed State Representation and Structural Credit Assignments" Proceedings of the 42nd Annual Conference of the Institute of Systems, Control and Information Engineers. 35-36 (1998)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] S. Yoshida and N. Ono: "Multi-agent Reinforcement Learning via Decomposed State Representation." Proceedings of the SICE 26th Annual Symposium on Intelligent Systems. 163-168 (1998)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] H. Fujiki, Isao Ono and N. Ono: "Decision Tree Representation of State Space for Reinforcement Learning Agents and Its Acquisition based on Genetic Programming." Proceedings of the 43rd Annual Conference of the Institute of Systems, Control and Information Engineers. 25-26 (1999)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] T. Nijo, Isao Ono and N. Ono: "Evolutionary Acquisition of Agent Structures in Modular Reinforcement Learning." Proceedings of the 43rd Annual Conference of the Institute of Systems, Control and Information Engineers. 27-28 (1999)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] N. Ono and S. Yoshida: "Synthetic Collective Behavior by Multiple Reinforcement Learning Agents in Simulated Dodgeball Game." Proceedings of the Fourth International Symposium on Artificial Life and Robotics (AROB 4th '99). 540-543 (1999)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] M. Doi, Isao Ono, N. Ono, H. Kimura and S. Kobayashi: "Discussion on Application of Reinforcement Learning in a Real Environment." Proceedings of FAN Symposium '99 in Fukui. 133-138 (1999)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] M. Takahashi, Isao Ono and N. Ono: "Evolutionary Acquisition of Policies in an Environment with Delayed Reward." Proceedings of FAN Symposium '99 in Fukui. 291-296 (1999)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] H. Fujiki, Isao Ono and N. Ono: "A Reinforcement Learning Scheme based on Decision Tree Representation of State Space and Its Genetic Acquisition." Proceedings of the Fifth International Symposium on Artificial Life and Robotics (AROB 5th '00). 563-567 (2000)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] T. Nijo, Isao Ono and N. Ono: "Evolution of Modular Structures for Multiple Reinforcement Learning Agents." Proceedings of the Fifth International Symposium on Artificial Life and Robotics (AROB 5th '00). 576-579 (2000)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] I. Ono, T. Nijo and N. Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents." Proceedings of the 2000 Genetic and Evolutionary Computation Conference (GECCO-2000). (to appear).
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] I. Ono, M. Takahashi and N. Ono: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA using the Unimodal Normal Distribution Crossover." Proceedings of the 2000 Congress on Evolutionary Computation (CEC2000). (to appear). (2000)
- Description
  From the Summary of Research Results Report (English)
- Related Report
  1999 Final Research Report Summary
[Publications] T. Nijo, I. Ono and N. Ono: "Evolution of Modular Structures for Multiple Reinforcement Learning Agents" Proc. 5th International Symposium on Artificial Life and Robotics. 576-579 (2000)
- Related Report
  1999 Annual Research Report
[Publications] H. Fujiki, I. Ono and N. Ono: "A Reinforcement Learning Scheme based on Decision Tree Representation of State Space and Its Genetic Acquisition" Proc. 5th International Symposium on Artificial Life and Robotics. 563-567 (2000)
- Related Report
  1999 Annual Research Report
[Publications] I. Ono, T. Nijo and N. Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents" Proc. GECCO-2000. (to be presented). (2000)
- Related Report
  1999 Annual Research Report
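[Publications] M. Doi, I. Ono and N. Ono: "An Experimental Study on the Application of Reinforcement Learning in Real Environments" (in Japanese) Proceedings of FAN Symposium '99. 133-138 (1999)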
- Related Report
  1999 Annual Research Report
[Publications] M. Takahashi, I. Ono and N. Ono: "Evolutionary Acquisition of Behavior Policies in Environments with Delayed Rewards" (in Japanese) Proceedings of FAN Symposium '99. 291-296 (1999)
- Related Report
  1999 Annual Research Report
[Publications] T. Ito, H. Iba and S. Sato: "Advances in Genetic Programming, Vol. 3 (contributed in part)" The MIT Press. 476 (1999)
- Related Report
  1999 Annual Research Report
[Publications] N. Ono, S. Yoshida: "Synthesis of Coordinated Behavior in Simulated Dodgeball Games" Proceedings of International Conference on Intelligent Autonomous Systems (IAS-5). 663-668 (1998)
- Related Report
  1998 Annual Research Report
[Publications] N. Ono, S. Yoshida: "Synthetic Collective Behavior by Multiple Reinforcement Learning Agents in Simulated Dodgeball Game" Proceedings of the Fourth International Symposium on Artificial Life and Robotics. 540-543 (1999)
- Related Report
  1998 Annual Research Report
[Publications] I. Ono, S. Kobayashi, K. Yoshida: "A Genetic Algorithm Taking Account of Characteristics Preservation for Job Shop Scheduling Problems" Proceedings of International Conference on Intelligent Autonomous Systems (IAS-5). 711-718 (1998)
- Related Report
  1998 Annual Research Report
[Publications] I.Ono,S.Kobayashi,K.Yoshida: "Global and Multi-objective Optimization for Lens Design by Real-coded Genetic Algorithms." Proceedings of International Optical Design Conference. (1998)
- Related Report
  1998 Annual Research Report
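[Publications] S. Yoshida, I. Ono, N. Ono: "Multi-agent Reinforcement Learning Based on a Compressed Representation of the State Space" (in Japanese) Proceedings of the SICE 26th Symposium on Intelligent Systems. (1999)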
- Related Report
  1998 Annual Research Report
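[Publications] I. Ono, H. Kita, S. Kobayashi: "Optimization of Functions with Continuous and Discrete Variables by a GA Considering Dependencies among Variables" (in Japanese) Proceedings of the SICE 26th Symposium on Intelligent Systems. (1999)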
- Related Report
  1998 Annual Research Report
