Multi-agent Reinforcement Learning Based on Compressed Representation of Decision Policies

Research Project

Project/Area Number	12680387
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	The University of Tokushima
Principal Investigator	ONO Norihiko, The University of Tokushima, Faculty of Engineering, Professor (60194594)
Co-Investigator(Kenkyū-buntansha)	ITO Takuya, The University of Tokushima, Faculty of Engineering, Research Associate (50314844); ONO Isao, The University of Tokushima, Faculty of Engineering, Associate Professor (00304551)
Project Period (FY)	2000 – 2001
Project Status	Completed (Fiscal Year 2001)
Budget Amount	¥3,600,000 (Direct Cost: ¥3,600,000); Fiscal Year 2001: ¥1,400,000 (Direct Cost: ¥1,400,000); Fiscal Year 2000: ¥2,200,000 (Direct Cost: ¥2,200,000)
Keywords	MULTI-AGENT SYSTEMS / MULTI-AGENT REINFORCEMENT LEARNING / REINFORCEMENT LEARNING / MACHINE LEARNING / EVOLUTIONARY COMPUTING / NEURAL NETWORKS / REAL-CODED GA / OPTIMIZATION / CO-EVOLUTION / GENERATION ALTERNATION MODEL / AUTONOMOUS AGENTS / ARTIFICIAL INTELLIGENCE / DISTRIBUTED ARTIFICIAL INTELLIGENCE
Research Abstract	Several attempts have been reported to let multiple monolithic reinforcement learning (RL) agents synthesize the highly coordinated behavior needed to accomplish a common goal effectively. Most of these straightforward applications of RL scale poorly to more complex multi-agent (MA) learning problems, because the state space of each RL agent grows exponentially with the number of partner agents engaged in the joint task. To remedy the exponentially large state space in multi-agent RL (MARL), we previously proposed a modular approach and demonstrated its effectiveness on MA learning problems. The results obtained with the modular approach are encouraging, but it still has serious limitations. The approach assumes that (i) all sensory inputs and action outputs of an agent are discrete values, and (ii) all agents make their decisions fully synchronously at regular time intervals; these assumptions do not hold in real-world multi-agent environments in general. We propose another MARL framework that overcomes the state space explosion by representing each agent's decision policy as a neural network and optimizing it with a real-coded genetic algorithm (GA). The framework is applicable to multi-agent domains where individual agents receive and output discrete or continuous values and make their decisions asynchronously. To show the effectiveness of the proposed framework for real-world MARL, we applied it to the asynchronous multi-agent seesaw balancing problem and to the dynamic channel allocation problem in cellular telephone systems. The results are quite encouraging, whereas these problems cannot be solved appropriately with any other conventional MARL framework.
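
The core idea in the abstract, a neural-network decision policy whose weights are tuned by a real-coded GA, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the project: the toy single-agent task (steering a 1-D point toward the origin), the network size, the BLX-alpha crossover (swapped in for simplicity; the listed publications mention Unimodal Normal Distribution Crossover instead), the family-based replacement scheme, and all function names.

```python
import math
import random

N_IN, N_HIDDEN = 2, 3                                # hypothetical network size
N_WEIGHTS = N_HIDDEN * (N_IN + 1) + (N_HIDDEN + 1)   # weights plus biases


def policy(weights, obs):
    """Tiny feed-forward net: continuous observation -> action in [-1, 1]."""
    idx, hidden = 0, []
    for _ in range(N_HIDDEN):
        s = weights[idx + N_IN]                      # hidden-unit bias
        for i in range(N_IN):
            s += weights[idx + i] * obs[i]
        hidden.append(math.tanh(s))
        idx += N_IN + 1
    out = weights[idx + N_HIDDEN]                    # output bias
    for j in range(N_HIDDEN):
        out += weights[idx + j] * hidden[j]
    return math.tanh(out)


def fitness(weights):
    """Toy episode (assumed task): reward is minus the accumulated squared distance."""
    x, v, cost = 1.0, 0.0, 0.0
    for _ in range(50):
        a = policy(weights, (x, v))
        v += 0.1 * a
        x += 0.1 * v
        cost += x * x
    return -cost


def blx_alpha(p1, p2, rng, alpha=0.5):
    """BLX-alpha crossover: sample each gene from the parents' extended interval."""
    child = []
    for a, b in zip(p1, p2):
        lo, hi = min(a, b), max(a, b)
        span = hi - lo
        child.append(rng.uniform(lo - alpha * span, hi + alpha * span))
    return child


def evolve(generations=60, pop_size=30, n_children=10, seed=0):
    """Steady-state real-coded GA: two parents are replaced by the best two of their family."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    scores = [fitness(w) for w in pop]
    history = [max(scores)]
    for _ in range(generations):
        i, j = rng.sample(range(pop_size), 2)
        family = [(scores[i], pop[i]), (scores[j], pop[j])]
        for _ in range(n_children):
            child = blx_alpha(pop[i], pop[j], rng)
            family.append((fitness(child), child))
        family.sort(key=lambda t: t[0], reverse=True)
        (scores[i], pop[i]), (scores[j], pop[j]) = family[0], family[1]
        history.append(max(scores))
    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], history


if __name__ == "__main__":
    best, history = evolve()
    print(f"best fitness: {history[0]:.3f} -> {history[-1]:.3f}")
```

Because the GA only needs an episode-level score, the policy can take continuous inputs and produce continuous outputs without discretizing the state space. The sketch does not model multiple agents or asynchronous decision timing; those would enter through the simulation inside `fitness`.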

Report

(3 results)

2001 Annual Research Report / Final Research Report Summary
2000 Annual Research Report

Research Products
(24 results)

All Publications (24 results)

[Publications] Isao Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents" Proc. 2000 Genetic and Evolutionary Computation Conference (GECCO 2000). 203-210 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] Isao Ono: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA Using Unimodal Normal Distribution Crossover" Proc. 2000 Congress on Evolutionary Computation (CEC2000). 659-666 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
[Publications] Yorikazu Takao: "Constructing Approximation Models Based on Agent-Based Simulations by Genetic Algorithms" Proc. Fourth International Conference on Computational Intelligence and Multimedia Applications. 231-235 (2001)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2001 Final Research Report Summary
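[Publications] Takayuki Yamamoto: "An Evolutionary Approach to Asynchronous Multi-agent Reinforcement Learning (in Japanese)" Proc. 28th SICE Symposium on Intelligent Systems. 21-26 (2001)
- Description
  From the Final Research Report Summary (Japanese)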
- Related Report
  2001 Final Research Report Summary
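[Publications] Toshikazu Nakahara: "Automatic Acquisition of Behavior Policies for Soccer Agents Using Neural Network Representation (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). 65-66 (2001)
- Description
  From the Final Research Report Summary (Japanese)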
- Related Report
  2001 Final Research Report Summary
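[Publications] Masayuki Maguchi: "A Generation Alternation Model for Co-Evolutionary Acquisition of Behavior Policies in Competitive Games (in Japanese)" Proc. 46th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (in press). (2002)
- Description
  From the Final Research Report Summary (Japanese)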
- Related Report
  2001 Final Research Report Summary
[Publications] Isao Ono, Tetsuo Nijo and Norihiko Ono: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents" Proc. 2000 Genetic and Evolutionary Computation Conference (GECCO 2000). 203-210 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Isao Ono, Miyuki Takahashi and Norihiko Ono: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA Using Unimodal Normal Distribution Crossover" Proc. 2000 Congress on Evolutionary Computation (CEC2000). 659-666 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Yorikazu Takao, Isao Ono and Norihiko Ono: "Constructing Approximation Models Based on Agent-Based Simulations by Genetic Algorithms" Proc. Fourth International Conference on Computational Intelligence and Multimedia Applications. 231-235 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Takayuki Yamamoto, Yoko Nakanishi, Isao Ono and Norihiko Ono: "Optimization of Asynchronous Multi-agent Systems with Real-Coded Genetic Algorithms (in Japanese)" Proc. 28th SICE Symposium on Intelligent Systems. 21-26 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Toshikazu Nakahara, Masayuki Maguchi, Isao Ono and Norihiko Ono: "Evolutionary Acquisition of Policies for Soccer Agents with Neural Networks (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). 65-66 (2001)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
[Publications] Masayuki Maguchi, Norihiko Ono and Isao Ono: "On Co-Evolutionary Acquisition of Effective Policies in Two-Player Competitive Games (in Japanese)" Proc. 46th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (in press). (2002)
- Description
  From the Final Research Report Summary (English)
- Related Report
  2001 Final Research Report Summary
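[Publications] Miyuki Takahashi: "Co-Evolution of Neural Network Agents and Training Examples (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). 61-62 (2001)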
- Related Report
  2001 Annual Research Report
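[Publications] Toshikazu Nakahara: "Evolutionary Acquisition of Behavior Policies for Soccer Agents Using Neural Network Representation (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). 63-64 (2001)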
- Related Report
  2001 Annual Research Report
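[Publications] Masayuki Maguchi: "Co-Evolutionary Acquisition of Behavior Policies in Competitive Games (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). 15-16 (2001)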
- Related Report
  2001 Annual Research Report
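[Publications] Takayuki Yamamoto: "Evolutionary Design of Asynchronous Multi-agent Systems (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). 19-20 (2001)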
- Related Report
  2001 Annual Research Report
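[Publications] 道辻壮哉: "Emergent Design of Soccer Agents with Evolutionary Neural Networks (in Japanese)" Proc. 46th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (in press). (2002)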
- Related Report
  2001 Annual Research Report
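[Publications] Masayuki Maguchi: "A Generation Alternation Model for Co-Evolutionary Acquisition of Behavior Policies in Competitive Games (in Japanese)" Proc. 46th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (in press). (2002)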
- Related Report
  2001 Annual Research Report
[Publications] Isao ONO: "A Genetic Algorithm for Automatically Designing Modular Reinforcement Learning Agents" Proceedings of the 2000 Genetic and Evolutionary Computation Conference. 203-210 (2000)
- Related Report
  2000 Annual Research Report
[Publications] Isao ONO: "Evolving Neural Networks in Environments with Delayed Rewards by A Real-Coded GA Using Unimodal Normal Distribution Crossover" Proceedings of the 2000 Congress on Evolutionary Computation. 659-666 (2000)
- Related Report
  2000 Annual Research Report
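[Publications] Takayuki Yamamoto: "An Evolutionary Approach to Asynchronous Multi-agent Reinforcement Learning Problems (in Japanese)" Proc. 28th SICE Symposium on Intelligent Systems. (2001)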
- Related Report
  2000 Annual Research Report
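[Publications] Toshikazu Nakahara: "Evolutionary Acquisition of Behavior Policies for Soccer Agents Using Neural Network Representation (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (2001)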
- Related Report
  2000 Annual Research Report
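[Publications] 山下裕志: "An Experimental Study on Co-Evolutionary Acquisition of Competitive Game Policies by Heterogeneous Agents (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (2001)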
- Related Report
  2000 Annual Research Report
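[Publications] Miyuki Takahashi: "Co-Evolution of Neural Network Agents and Training Examples (in Japanese)" Proc. 45th Annual Conference of the Institute of Systems, Control and Information Engineers (ISCIE). (2001)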
- Related Report
  2000 Annual Research Report
