2001 Fiscal Year Final Research Report Summary
Multi-agent Reinforcement Learning Based on Compressed Representation of Decision Policies
Project/Area Number | 12680387 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | The University of Tokushima |
Principal Investigator | ONO Norihiko, The University of Tokushima, Faculty of Engineering, Professor (60194594) |
Co-Investigator (Kenkyū-buntansha) | ITO Takuya, The University of Tokushima, Faculty of Engineering, Research Associate (50314844) |
| ONO Isao, The University of Tokushima, Faculty of Engineering, Associate Professor (00304551) |
Project Period (FY) | 2000 – 2001 |
Keywords | MULTI-AGENT SYSTEMS / MULTI-AGENT REINFORCEMENT LEARNING / REINFORCEMENT LEARNING / MACHINE LEARNING / EVOLUTIONARY COMPUTING / NEURAL NETWORKS / REAL-CODED GA / OPTIMIZATION |
Research Abstract |
Several attempts have been reported to let multiple monolithic reinforcement learning (RL) agents synthesize the highly coordinated behavior needed to accomplish a common goal effectively. Most of these straightforward applications of RL scale poorly to more complex multi-agent (MA) learning problems, because the state space for each RL agent grows exponentially with the number of partner agents engaged in the joint task. To remedy this exponential growth of the state space in multi-agent RL (MARL), we previously proposed a modular approach and demonstrated its effectiveness through applications to MA learning problems. The results obtained with the modular approach to MARL are encouraging, but it still has serious problems. The approach assumes that (i) all the sensory inputs and action outputs of an agent are discrete values, and (ii) all the agents make their decisions fully synchronously at regular time intervals; such assumptions do not hold in real-world multi-agent environments in general. We propose yet another MARL framework that overcomes the state space explosion in MARL, based on a neural-network representation of an agent's decision policy and its optimization with a real-coded GA. The framework is applicable to multi-agent domains where individual agents may receive and output discrete or continuous values and make their decisions asynchronously. To show the effectiveness of the proposed framework for real-world MARL, we have applied it to the asynchronous multi-agent seesaw balancing problem and to the dynamic channel allocation problem in cellular telephone systems. The results are quite encouraging, whereas these problems cannot be solved appropriately by any other conventional MARL framework.
|
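The core idea of the framework above can be illustrated in miniature: encode a small neural-network policy as a flat real-valued genome and optimize it with a real-coded GA instead of value-based RL, so continuous inputs and outputs pose no state-space problem. The sketch below is not taken from the report; the network size, the toy fitness task (matching a simple proportional control law, standing in for an agent's continuous control objective), and the GA parameters (BLX-α crossover, Gaussian mutation, tournament selection, elitism) are illustrative assumptions.

```python
import math
import random

random.seed(0)

H = 6          # hidden units (illustrative)
POP = 40       # population size
GENS = 60      # generations

def init_genome():
    # Flat real-valued genome: all weights of a 1-H-1 tanh network.
    return [random.uniform(-1.0, 1.0) for _ in range(3 * H + 1)]

def policy(g, x):
    # Neural-network decision policy: continuous input -> continuous action.
    out = g[3 * H]                         # output bias
    for i in range(H):
        h = math.tanh(g[i] * x + g[H + i])  # hidden unit i
        out += g[2 * H + i] * h
    return out

def fitness(g):
    # Toy stand-in objective: reproduce the control law u = -2x
    # (negative mean squared error, so larger is better).
    xs = [i / 10.0 - 1.0 for i in range(21)]
    return -sum((policy(g, x) + 2.0 * x) ** 2 for x in xs) / len(xs)

def blx_crossover(p1, p2, alpha=0.5):
    # BLX-alpha: sample each child gene uniformly from an interval
    # extending alpha * |a - b| beyond the two parent genes.
    child = []
    for a, b in zip(p1, p2):
        lo, hi = min(a, b), max(a, b)
        d = hi - lo
        child.append(random.uniform(lo - alpha * d, hi + alpha * d))
    return child

def mutate(g, rate=0.1, sigma=0.2):
    # Gaussian perturbation of a fraction of the genes.
    return [w + random.gauss(0.0, sigma) if random.random() < rate else w
            for w in g]

pop = [init_genome() for _ in range(POP)]
f0 = max(fitness(g) for g in pop)          # best fitness before evolution
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    nxt = pop[:4]                          # elitism: keep the 4 best
    while len(nxt) < POP:
        p1 = max(random.sample(pop, 3), key=fitness)   # tournament select
        p2 = max(random.sample(pop, 3), key=fitness)
        nxt.append(mutate(blx_crossover(p1, p2)))
    pop = nxt
best = max(pop, key=fitness)
```

Because the GA only needs scalar fitness evaluations of whole policies, the same loop applies unchanged when agents act asynchronously or emit continuous actions, which is the property the framework exploits.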
Research Products (12 results)