1996 Fiscal Year Annual Research Report

Construction of a Multi-Strategy Multi-Agent System and Its Learning Method

Research Project

Project/Area Number	07680376
Research Institution	HOKKAIDO UNIVERSITY
Principal Investigator	Sadayoshi Mikami (三上貞芳), Hokkaido University, Faculty of Engineering, Associate Professor (50229655)
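Co-Investigator(Kenkyū-buntansha)	Keiji Suzuki (鈴木恵二), Hokkaido University, Faculty of Engineering, Associate Professor (10250482); Yukinori Kakazu (嘉数侑昇), Hokkaido University, Faculty of Engineering, Professor (60042090)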
Keywords	Reinforcement learning / Multi-agent / Cooperative behavior / Competition and cooperation
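Research Abstract	This research aimed to propose a planning method for environments in which multiple unknown factors influence one another. To that end, it proposed a multi-strategy approach: a problem-solving method in which a multi-agent system undergoes functional differentiation through trial-and-error learning, so that its agents cooperate while acquiring independent, mutually complementary strategies. It also sought to realize the learning mechanism for this approach. The results obtained during the project period are summarized as follows.

1. Two directions were examined: introducing multiple strategies inside a single learning agent, and group problem solving by multiple autonomous agents. For the single learning agent, it was shown that the ordinary reinforcement learning framework can be adapted to the multi-strategy acquisition problem by means of three elements. First, the agent contains learning sub-agents whose state inputs and evaluation functions are independent of one another. Second, a utility function is newly defined as a function of each sub-agent's success frequency and the state observation. Third, the agent is given an internal arbitration agent that learns with this utility function as its reward, the short-term history of sub-agent use as its state, and the selection of an action as its action. This was formulated as an algorithm and applied to a multi-body balancing problem, which confirmed its effectiveness.

2. For the group problem-solving approach, it was first shown that an appropriate way to exert pressure toward functional differentiation of knowledge in each autonomous learning agent is to add a manipulated quantity to the reward value, which is the minimal form of reference information in reinforcement learning. To generate this manipulated quantity, a spatio-temporal mixing function of local reward signals was developed and introduced. Goals with respect to the environment fall broadly into two types, cooperative and competitive. It was shown that setting the spatio-temporal mixing function to a smoothing type and an emphasizing type, respectively, allows functional differentiation to proceed without interference between the agents' learning, so that the overall goal can be achieved. This was formulated as an algorithm and applied to a multi-body collision-avoidance rule generation problem, which confirmed its effectiveness.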
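The reward manipulation described in item 2 of the abstract can be illustrated with a small sketch. The abstract does not give the actual form of the spatio-temporal mixing function, so everything below is an assumption made for illustration: the function name `mixed_reward`, the parameters `alpha` (spatial mixing strength) and `beta` (temporal decay), the use of the mean reward of the other agents as the spatial term, and the exponential weighting over past steps. The "smoothing" mode pulls an agent's reward toward that of the other agents, and the "sharpening" mode exaggerates the difference from them; pairing these with cooperative and competitive goals is likewise only one plausible reading.

```python
from typing import Sequence


def mixed_reward(
    history: Sequence[Sequence[float]],
    agent: int,
    alpha: float,
    beta: float,
    mode: str = "smoothing",
) -> float:
    """Hypothetical spatio-temporal mixing of local reward signals.

    history[t][j] is the local reward of agent j at time step t,
    with the most recent step last. Returns the mixed reward for `agent`.
    """
    if not history:
        raise ValueError("history must contain at least one time step")

    total = 0.0
    norm = 0.0
    # Walk backwards in time: age 0 is the most recent step.
    for age, rewards in enumerate(reversed(history)):
        if len(rewards) < 2:
            raise ValueError("at least two agents are required")
        own = rewards[agent]
        mean_others = (sum(rewards) - own) / (len(rewards) - 1)

        # Spatial mixing across agents (assumed forms, not from the report).
        if mode == "smoothing":
            spatial = (1.0 - alpha) * own + alpha * mean_others
        elif mode == "sharpening":
            spatial = own + alpha * (own - mean_others)
        else:
            raise ValueError(f"unknown mode: {mode}")

        # Temporal mixing: exponentially decaying weight on older steps.
        weight = beta ** age
        total += weight * spatial
        norm += weight

    return total / norm


if __name__ == "__main__":
    # Two agents, two time steps; agent 0 was rewarded on the latest step.
    rewards_over_time = [[0.0, 0.0], [1.0, 0.0]]
    for m in ("smoothing", "sharpening"):
        print(m, [mixed_reward(rewards_over_time, i, 0.5, 0.5, m) for i in (0, 1)])
```

In this sketch each agent would feed the mixed value, rather than its raw local reward, to its own reinforcement learner. The choice of mode is the only thing that differs between the two goal types.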

Research Products
(6 results)

[Publications] Tomoyoshi Nakayama: "A Realization of Socially Adaptive Robots by Competitive Reinforcement Learning" 5th IEEE International Workshop on Robot and Human Communication. 107-111 (1996)
[Publications] Sadayoshi Mikami: "Combining Reinforcement Learning with GA to Find Co-ordinated Control Rules for Multi-Agent System" 1996 IEEE International Conference on Evolutionary Computation. 231-236 (1996)
[Publications] Sadayoshi Mikami: "Acquiring Co-operation without Communication by Reinforcement Learning and Dynamics Identification" Distributed Autonomous Robotic Systems 2. 439-439 (1996)
[Publications] Mitsuo Wada: "An Approach to Chaos and Self-Organizing Behaviors in Symbiotic Relationships between Human and Robots" Journal of Robotics and Mechatronics. 8.4. 318-322 (1996)
[Publications] Sadayoshi Mikami: "Distributed GA for Evolving Co-operation of Autonomous Agents" Singapore International Conference for Information Computation and Instrumentation. 57-62 (1995)
[Publications] Sadayoshi Mikami: "Broadcast Based Fitness Sharing GA for Conflict Resolution Among Autonomous Robots" Evolutionary Computing 2. 40-48 (1995)
