2003 Fiscal Year Final Research Report Summary

Fundamental Research on Advanced Evolutionary and Adaptive Systems

Research Project

Project/Area Number	13480089
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	一般
Research Field	Intelligent informatics
Research Institution	Tokyo Institute of Technology
Principal Investigator	KOBAYASHI Shigenobu Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Professor, 大学院・総合理工学研究科, 教授 (40016697)
Co-Investigator(Kenkyū-buntansha)	KIMURA Hajime Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Research Associate, 大学院・総合理工学研究科, 助手 (40302963)
Project Period (FY)	2001 – 2003
Keywords	Evolutionary Computation / Real-coded Genetic Algorithms / UV Structure / k-tablet Structure / Reinforcement Learning / Actor-Critic / four-legged Walking Robot / Distributed Reinforcement Learning
Research Abstract	A.Research Results on Evolutionary Computation ・We proposed a robust real-coded GA using the combination of two crossovers, UNDX-m and EDX. It can deal both ridge-structure function whose dimension reaches more than hundreds and multi-peak functions. ・We proposed a new evolutionary algorithm called ANS (Adaptive neighboring Search) with a crossover-like mutation to optimize high dimensional deceptive multimodal functions. ・We found a hypothesis call "UV-phenomenon" which explains failures of global search by GA. It suggests UV-structures as hard landscape structures that will cause the UV-phenomenon. B.Research Results on Reinforcement Learning ・We proposed a new crossover LUNDX-m which uses only in-dimensional latent variables. LUNDX-m can treat with high-dimensional ill-scaled structures called k -tablet structure. ・In multi-agent reinforcement learning systems, it is important how to share a reward among all agents. We derived the necessary and sufficient condition to preserve the rationality to realize cooperative behaviors. ・We proposed a policy function representation that consists of a stochastic binary decision tree. We applied it to an actor-critic algorithm for the problems that have enormous similar actions. ・We investigated a reinforcement learning of walking behavior for a four-legged robot. We presented. a new actor-critic algorithm, in which the actor selects a continuous action from its bounded action space by using the normal distribution. ・We proposed a new simulation-based distributed reinforcement learning approach that solves large planning problems under uncertain environment. We applied it to real sewerage control systems.

Research Products
(18 results)

All Other

All Publications (18 results)

[Publications] 木村元, 小林重信: "確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習"計測自動制御学会論文誌. 37. 1147-1155 (2001)
- Description
  「研究成果報告書概要(和文)」より
[Publications] 池田心, 小林重信: "GAの探索におけるUV現象とUV構造仮説"人工知能学会論文誌. 17. 239-246 (2002)
- Description
  「研究成果報告書概要(和文)」より
[Publications] 木村元, 荒牧岳志, 小林重信: "重み付けされた複数の正規分布を用いた政策表現-最適行動変化に追従できる実時間学習と環状ロボットへの適用"人工知能学会論文誌. 18. 316-324 (2003)
- Description
  「研究成果報告書概要(和文)」より
[Publications] 青木圭, 木村元, 長岩明弘, 小林重信: "分散強化学習による下水送水系の制御"電気学会論文誌D. 123. 462-469 (2003)
- Description
  「研究成果報告書概要(和文)」より
[Publications] 佐久間淳, 小林重信: "高次元k-tablet構造を考慮した実数値GA〜隠れ変数上の交叉LUNDX-mの提案と評価"人工知能学会論文誌. 19. 28-37 (2004)
- Description
  「研究成果報告書概要(和文)」より
[Publications] Kamiya, A., Kawai, K., Ono, I., Kobayashi, K.: "Theoretical Proof of Edge Search Strategy Applied to Power Plant Star-up Scheduling"IEEE Trans. On Systems, Man and Cybernetics -Part B : Cybernetics. Vol.32,No.3. 316-331 (2002)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Kimura, H., Yamashita, T., Kobayashi, S.: "Reinforcement Learning of Walking Behavior for a Four-Legged Robot"40th IEEE Conference on Decision and Control (CDC2001). 411-416 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Sato, M., Kobayashi, S.: "Average-Reward Reinforcement Learning for Variance Penalized Markov Decision Problems"Proc. of the 18th Int. Conf. on Machine Learning. 473-480 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Takahashi, O., Kobayashi, S.: "An Adaptive Neighboring Search Using Crossover-Like Mutation For Multimodal Function Optimization"IEEE International Conference on Systems, Man and Cybernetics. 261-267 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Miyazaki, K., Kobayashi, S.: "On the Rationality of Profit Sharing in Multi-agent Reinforcement Learning"Proc. of the 4th Int. Conf. on Computational Intelligence and Multimedia Applications. 123-127 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Miyazaki, K., Tsuboi, S., Kobayashi, S.: "Reinforcement Learning for Penalty Avoiding Policy Making"Proc. of the 7th Int. Conf. on Information Systems Analysis and Cynthesis. Vol 3. 40-44 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Ikeda, K., Kita, H., Kobayashi, S.: "Failure of Pareto-Based MOEAS : Does Non-Dominated Really Mean Near to Optimal?"Proc. of Congress on Evolutionary Computation. 957-962 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Sakuma, J., Kobayashi, S.: "Extrapolation-Directed Crossover for Real -coded GA : Overcoming Deceptive Phenomena by Extrapolative Search"Proc. of Congress on Evolutionary Computation. 655-662 (2001)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Ikeda, K., Kobayashi, S.: "Deterministic Multi-step Crossover Fusion: A handy Crossover Composition for GAs"Proc. of 7th Parallel Problem Solving from Nature. 162-171 (2002)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Sakuma, J., Kobayashi, S.: "Non-parametric Expectation-Maximization for Gaussian Mixture"Proc. of 9th Int. Conf. on Neural Information Processing. (2002)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Sakuma, J., Kobayashi, S.: "k-tablet Structure and Crossover on Latent Variables for Real-coded GA ; Proc"Proc. Of Int. Conf. on Genetic Algorithms. 404-411 (2002)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Kamiya, A., Kato, M., Shimada, K., Kobayashi, S.: "Fusion of Soft Computing and Hard Computing Techniques for Power Energy System"IEEE International Symposium on Computational Intelligence in Robotics and Automation, 2003. (2003)
- Description
  「研究成果報告書概要(欧文)」より
[Publications] Miyazaki, K., Kobayashi, S.: "Rationality of Reward Sharing in Multi-agent Reinforcement Learning"Journal of New Generation Computing. Vol.91 No.2. 157-172 (2001)
- Description
  「研究成果報告書概要(欧文)」より

2003 Fiscal Year Final Research Report Summary

Fundamental Research on Advanced Evolutionary and Adaptive Systems

Principal Investigator

KOBAYASHI Shigenobu Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Professor, 大学院・総合理工学研究科, 教授 (40016697)

Research Products

[Publications] 木村元, 小林重信: "確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習"計測自動制御学会論文誌. 37. 1147-1155 (2001)

Description

[Publications] 池田心, 小林重信: "GAの探索におけるUV現象とUV構造仮説"人工知能学会論文誌. 17. 239-246 (2002)

Description

[Publications] 木村元, 荒牧岳志, 小林重信: "重み付けされた複数の正規分布を用いた政策表現-最適行動変化に追従できる実時間学習と環状ロボットへの適用"人工知能学会論文誌. 18. 316-324 (2003)

Description

[Publications] 青木圭, 木村元, 長岩明弘, 小林重信: "分散強化学習による下水送水系の制御"電気学会論文誌D. 123. 462-469 (2003)

Description

[Publications] 佐久間淳, 小林重信: "高次元k-tablet構造を考慮した実数値GA〜隠れ変数上の交叉LUNDX-mの提案と評価"人工知能学会論文誌. 19. 28-37 (2004)

Description

[Publications] Kamiya, A., Kawai, K., Ono, I., Kobayashi, K.: "Theoretical Proof of Edge Search Strategy Applied to Power Plant Star-up Scheduling"IEEE Trans. On Systems, Man and Cybernetics -Part B : Cybernetics. Vol.32,No.3. 316-331 (2002)

Description

[Publications] Kimura, H., Yamashita, T., Kobayashi, S.: "Reinforcement Learning of Walking Behavior for a Four-Legged Robot"40th IEEE Conference on Decision and Control (CDC2001). 411-416 (2001)

Description

[Publications] Sato, M., Kobayashi, S.: "Average-Reward Reinforcement Learning for Variance Penalized Markov Decision Problems"Proc. of the 18th Int. Conf. on Machine Learning. 473-480 (2001)

Description

[Publications] Takahashi, O., Kobayashi, S.: "An Adaptive Neighboring Search Using Crossover-Like Mutation For Multimodal Function Optimization"IEEE International Conference on Systems, Man and Cybernetics. 261-267 (2001)

Description

[Publications] Miyazaki, K., Kobayashi, S.: "On the Rationality of Profit Sharing in Multi-agent Reinforcement Learning"Proc. of the 4th Int. Conf. on Computational Intelligence and Multimedia Applications. 123-127 (2001)

Description

[Publications] Miyazaki, K., Tsuboi, S., Kobayashi, S.: "Reinforcement Learning for Penalty Avoiding Policy Making"Proc. of the 7th Int. Conf. on Information Systems Analysis and Cynthesis. Vol 3. 40-44 (2001)

Description

[Publications] Ikeda, K., Kita, H., Kobayashi, S.: "Failure of Pareto-Based MOEAS : Does Non-Dominated Really Mean Near to Optimal?"Proc. of Congress on Evolutionary Computation. 957-962 (2001)

Description

[Publications] Sakuma, J., Kobayashi, S.: "Extrapolation-Directed Crossover for Real -coded GA : Overcoming Deceptive Phenomena by Extrapolative Search"Proc. of Congress on Evolutionary Computation. 655-662 (2001)

Description

[Publications] Ikeda, K., Kobayashi, S.: "Deterministic Multi-step Crossover Fusion: A handy Crossover Composition for GAs"Proc. of 7th Parallel Problem Solving from Nature. 162-171 (2002)

Description

[Publications] Sakuma, J., Kobayashi, S.: "Non-parametric Expectation-Maximization for Gaussian Mixture"Proc. of 9th Int. Conf. on Neural Information Processing. (2002)

Description

[Publications] Sakuma, J., Kobayashi, S.: "k-tablet Structure and Crossover on Latent Variables for Real-coded GA ; Proc"Proc. Of Int. Conf. on Genetic Algorithms. 404-411 (2002)

Description

[Publications] Kamiya, A., Kato, M., Shimada, K., Kobayashi, S.: "Fusion of Soft Computing and Hard Computing Techniques for Power Energy System"IEEE International Symposium on Computational Intelligence in Robotics and Automation, 2003. (2003)

Description

[Publications] Miyazaki, K., Kobayashi, S.: "Rationality of Reward Sharing in Multi-agent Reinforcement Learning"Journal of New Generation Computing. Vol.91 No.2. 157-172 (2001)

Description