Fundamental Research on Advanced Evolutionary and Adaptive Systems

Research Project

Project/Area Number	13480089
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Tokyo Institute of Technology
Principal Investigator	KOBAYASHI Shigenobu, Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Professor (40016697)
Co-Investigator (Kenkyū-buntansha)	KIMURA Hajime, Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Research Associate (40302963)
Project Period (FY)	2001 – 2003
Project Status	Completed (Fiscal Year 2003)
Budget Amount	¥13,600,000 (Direct Cost: ¥13,600,000); Fiscal Year 2003: ¥2,900,000 (Direct Cost: ¥2,900,000); Fiscal Year 2002: ¥2,900,000 (Direct Cost: ¥2,900,000); Fiscal Year 2001: ¥7,800,000 (Direct Cost: ¥7,800,000)
Keywords	Evolutionary Computation / Real-coded Genetic Algorithms / UV Structure / k-tablet Structure / Reinforcement Learning / Actor-Critic / Four-legged Walking Robot / Distributed Reinforcement Learning / α-domination Strategy / Profit Sharing / Powder X-ray Analysis / Protein Structure Determination / UV-Structure Hypothesis / Multi-agent Systems / Evolutionary Systems / Genetic Algorithms / Multi-objective GA / Adaptive Systems / Multi-agent Reinforcement Learning / Rationality of Reward Sharing
Research Abstract	A. Research Results on Evolutionary Computation
・We proposed a robust real-coded GA that combines two crossovers, UNDX-m and EDX. It can handle both ridge-structured functions with hundreds of dimensions or more and multi-peak functions.
・We proposed a new evolutionary algorithm called ANS (Adaptive Neighboring Search), which uses a crossover-like mutation to optimize high-dimensional, deceptive multimodal functions.
・We proposed a hypothesis, called the "UV-phenomenon", that explains failures of global search by GAs. It identifies UV-structures as hard landscape structures that cause the UV-phenomenon.
B. Research Results on Reinforcement Learning
・We proposed a new crossover, LUNDX-m, which uses only m-dimensional latent variables. LUNDX-m can handle high-dimensional, ill-scaled structures called k-tablet structures.
・In multi-agent reinforcement learning systems, how a reward is shared among the agents is important. We derived the necessary and sufficient condition on reward sharing that preserves rationality, so that cooperative behaviors can be realized.
・We proposed a policy function representation that consists of a stochastic binary decision tree. We applied it to an actor-critic algorithm for problems that have an enormous number of similar actions.
・We investigated reinforcement learning of walking behavior for a four-legged robot. We presented a new actor-critic algorithm in which the actor selects a continuous action from its bounded action space by using the normal distribution.
・We proposed a new simulation-based distributed reinforcement learning approach that solves large planning problems under uncertain environments. We applied it to real sewerage control systems.
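
The stochastic binary decision tree policy mentioned in the abstract can be illustrated with a rough Python sketch. This is not the authors' implementation: the class name `BinaryTreePolicy`, the complete-tree layout in heap order, and the sigmoid parameterization of each branch are all assumptions made for illustration. The idea it shows is that an action is chosen by a sequence of left/right coin flips from the root to a leaf, so selecting one of 2^depth actions takes only `depth` random decisions, and the probability of an action is the product of the branch probabilities along its path.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real-valued logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


class BinaryTreePolicy:
    """Hypothetical policy over 2**depth actions arranged as leaves of a complete binary tree."""

    def __init__(self, depth, logits=None):
        self.depth = depth
        n_internal = 2 ** depth - 1
        # One logit per internal node, stored in heap order (root at index 0).
        # Assumption: P(go right) at a node is sigmoid(logit).
        self.logits = [0.0] * n_internal if logits is None else list(logits)
        if len(self.logits) != n_internal:
            raise ValueError("expected one logit per internal node")

    def sample(self, rng=random):
        """Walk from the root to a leaf, flipping one biased coin per level."""
        node, action = 0, 0
        for _ in range(self.depth):
            go_right = int(rng.random() < sigmoid(self.logits[node]))
            node = 2 * node + 1 + go_right      # child index in heap order
            action = 2 * action + go_right      # build the action index bit by bit
        return action

    def prob(self, action):
        """Probability of an action: product of branch probabilities on its path."""
        node, p = 0, 1.0
        for level in reversed(range(self.depth)):
            bit = (action >> level) & 1         # most significant bit first
            p_right = sigmoid(self.logits[node])
            p *= p_right if bit else 1.0 - p_right
            node = 2 * node + 1 + bit
        return p
```

For example, `BinaryTreePolicy(3)` with all logits at zero assigns each of its 8 actions the same probability, and `policy.sample(random.Random(0))` draws one action index using three coin flips. In an actor-critic setting the logits would be adjusted by the learning rule; that part is omitted here.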

Report

(4 results)

2003 Annual Research Report, Final Research Report Summary
2002 Annual Research Report
2001 Annual Research Report

Research Products
(38 results)

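[Publications] Kimura, H., Kobayashi, S.: "An Actor-Critic Algorithm Using Stochastic Binary-Tree Action Selection: Reinforcement Learning That Handles Many Actions" Transactions of the Society of Instrument and Control Engineers. 37. 1147-1155 (2001) (in Japanese)
- Description
  From the "Final Research Report Summary (Japanese)"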
- Related Report
  2003 Final Research Report Summary
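[Publications] Ikeda, K., Kobayashi, S.: "The UV-Phenomenon in GA Search and the UV-Structure Hypothesis" Transactions of the Japanese Society for Artificial Intelligence. 17. 239-246 (2002) (in Japanese)
- Description
  From the "Final Research Report Summary (Japanese)"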
- Related Report
  2003 Final Research Report Summary
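[Publications] Kimura, H., Aramaki, T., Kobayashi, S.: "A Policy Representation Using Weighted Multiple Normal Distributions: Real-Time Learning That Can Track Changes in the Optimal Action and Its Application to a Ring-Shaped Robot" Transactions of the Japanese Society for Artificial Intelligence. 18. 316-324 (2003) (in Japanese)
- Description
  From the "Final Research Report Summary (Japanese)"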
- Related Report
  2003 Final Research Report Summary
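[Publications] Aoki, K., Kimura, H., Nagaiwa, A., Kobayashi, S.: "Control of a Sewage Conveyance System by Distributed Reinforcement Learning" Transactions of the Institute of Electrical Engineers of Japan, D. 123. 462-469 (2003) (in Japanese)
- Description
  From the "Final Research Report Summary (Japanese)"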
- Related Report
  2003 Final Research Report Summary
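[Publications] Sakuma, J., Kobayashi, S.: "A Real-Coded GA Considering High-Dimensional k-tablet Structure: Proposal and Evaluation of LUNDX-m, a Crossover on Latent Variables" Transactions of the Japanese Society for Artificial Intelligence. 19. 28-37 (2004) (in Japanese)
- Description
  From the "Final Research Report Summary (Japanese)"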
- Related Report
  2003 Final Research Report Summary
[Publications] Kamiya, A., Kawai, K., Ono, I., Kobayashi, S.: "Theoretical Proof of Edge Search Strategy Applied to Power Plant Start-up Scheduling" IEEE Trans. on Systems, Man and Cybernetics, Part B: Cybernetics. Vol.32, No.3. 316-331 (2002)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Kimura, H., Yamashita, T., Kobayashi, S.: "Reinforcement Learning of Walking Behavior for a Four-Legged Robot" 40th IEEE Conference on Decision and Control (CDC2001). 411-416 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Sato, M., Kobayashi, S.: "Average-Reward Reinforcement Learning for Variance Penalized Markov Decision Problems" Proc. of the 18th Int. Conf. on Machine Learning. 473-480 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Takahashi, O., Kobayashi, S.: "An Adaptive Neighboring Search Using Crossover-Like Mutation for Multimodal Function Optimization" IEEE International Conference on Systems, Man and Cybernetics. 261-267 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Miyazaki, K., Kobayashi, S.: "On the Rationality of Profit Sharing in Multi-agent Reinforcement Learning" Proc. of the 4th Int. Conf. on Computational Intelligence and Multimedia Applications. 123-127 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Miyazaki, K., Tsuboi, S., Kobayashi, S.: "Reinforcement Learning for Penalty Avoiding Policy Making" Proc. of the 7th Int. Conf. on Information Systems Analysis and Synthesis. Vol.3. 40-44 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Ikeda, K., Kita, H., Kobayashi, S.: "Failure of Pareto-Based MOEAs: Does Non-Dominated Really Mean Near to Optimal?" Proc. of Congress on Evolutionary Computation. 957-962 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Sakuma, J., Kobayashi, S.: "Extrapolation-Directed Crossover for Real-coded GA: Overcoming Deceptive Phenomena by Extrapolative Search" Proc. of Congress on Evolutionary Computation. 655-662 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Ikeda, K., Kobayashi, S.: "Deterministic Multi-step Crossover Fusion: A Handy Crossover Composition for GAs" Proc. of 7th Parallel Problem Solving from Nature. 162-171 (2002)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Sakuma, J., Kobayashi, S.: "Non-parametric Expectation-Maximization for Gaussian Mixture" Proc. of 9th Int. Conf. on Neural Information Processing. (2002)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Sakuma, J., Kobayashi, S.: "k-tablet Structure and Crossover on Latent Variables for Real-coded GA" Proc. of Int. Conf. on Genetic Algorithms. 404-411 (2002)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Kamiya, A., Kato, M., Shimada, K., Kobayashi, S.: "Fusion of Soft Computing and Hard Computing Techniques for Power Energy System" IEEE International Symposium on Computational Intelligence in Robotics and Automation. (2003)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
[Publications] Miyazaki, K., Kobayashi, S.: "Rationality of Reward Sharing in Multi-agent Reinforcement Learning" Journal of New Generation Computing. Vol.19, No.2. 157-172 (2001)
- Description
  From the "Final Research Report Summary (English)"
- Related Report
  2003 Final Research Report Summary
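[Publications] Kimura, H., Aramaki, T., Kobayashi, S.: "A Policy Representation Using Weighted Multiple Normal Distributions: Real-Time Reinforcement Learning That Can Track Changes in the Optimal Action and Its Application to a Ring-Shaped Robot" Transactions of the Japanese Society for Artificial Intelligence. Vol.18, No.6. 316-324 (2003) (in Japanese)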
- Related Report
  2003 Annual Research Report
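[Publications] Miyazaki, K., Kobayashi, S.: "An Extension of Profit Sharing to Partially Observable Environments: Proposal and Evaluation of PS-r*" Transactions of the Japanese Society for Artificial Intelligence. Vol.18, No.5. 286-296 (2003) (in Japanese)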
- Related Report
  2003 Annual Research Report
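[Publications] Aoki, K., Kimura, H., Nagaiwa, A., Kobayashi, S.: "Control of a Sewage Conveyance System by Distributed Reinforcement Learning" Transactions of the Institute of Electrical Engineers of Japan, D. Vol.123, No.4. 462-469 (2003) (in Japanese)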
- Related Report
  2003 Annual Research Report
[Publications] Kamiya, A., Kato, M., Shimada, K., Kobayashi, S.: "Fusion of Soft Computing and Hard Computing Techniques for Power Energy System" IEEE International Symposium on Computational Intelligence in Robotics and Automation. (2003)
- Related Report
  2003 Annual Research Report
[Publications] Iguchi, K., Kobayashi, S.: "Determination of 3D Structure from X-ray Powder Diffraction Using GA" 6th International Workshop on the Crystal Growth of Organic Materials. (2003)
- Related Report
  2003 Annual Research Report
[Publications] Takahashi, O., Kobayashi, S.: "Protein Model Building Using Image Processing Technique and Optimization Algorithms" Broom 2003 International Crystallography Meetings. (2003)
- Related Report
  2003 Annual Research Report
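[Publications] Ikeda, K., Kobayashi, S.: "The UV-Phenomenon in GA Search and the UV-Structure Hypothesis" Transactions of the Japanese Society for Artificial Intelligence. Vol.17, No.3. 239-246 (2002) (in Japanese)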
- Related Report
  2002 Annual Research Report
[Publications] Sakuma, J., Kobayashi, S.: "k-tablet Structure and Crossover on Latent Variables for Real-coded GA" Proc. Int. Conf. on Genetic Algorithms. 404-411 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Sakuma, J., Kobayashi, S.: "Non-parametric Expectation-Maximization for Gaussian Mixture" Proc. of 9th Int. Conf. on Neural Information Processing. 517-522 (2002)
- Related Report
  2002 Annual Research Report
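[Publications] Kimura, H., Yamashita, T., Kobayashi, S.: "Acquisition of Walking Behavior for a Four-Legged Robot by Reinforcement Learning" IEEJ Transactions on Electronics, Information and Systems. Vol.122-C, No.3. 330-337 (2002) (in Japanese)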
- Related Report
  2002 Annual Research Report
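[Publications] Miyazaki, K., Tsuboi, S., Kobayashi, S.: "Improvement of the Penalty Avoiding Policy Making Algorithm and Its Application to the Game of Othello" Transactions of the Japanese Society for Artificial Intelligence. Vol.17, No.5. 548-556 (2002) (in Japanese)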
- Related Report
  2002 Annual Research Report
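[Publications] Aoki, K., Kimura, H., Kobayashi, S.: "Control of a Water Supply Conveyance System by Cooperative Distributed Reinforcement Learning" 30th Symposium on Intelligent Systems. 155-160 (2003) (in Japanese)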
- Related Report
  2002 Annual Research Report
[Publications] Hajime Kimura, Toru Yamashita, Shigenobu Kobayashi: "Reinforcement Learning of Walking Behavior for a Four-Legged Robot" 40th IEEE Conference on Decision and Control. 411-416 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Kazuteru Miyazaki, Shigenobu Kobayashi: "Rationality of Reward Sharing in Multi-agent Reinforcement Learning" Journal of New Generation Computing. Vol.19, No.2. 157-172 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Kokoro Ikeda, Hajime Kita, Shigenobu Kobayashi: "Failure of Pareto-Based MOEAs: Does Non-Dominated Really Mean Near to Optimal?" Congress on Evolutionary Computation (CEC2001). 957-962 (2001)
- Related Report
  2001 Annual Research Report
[Publications] Jun Sakuma, Shigenobu Kobayashi: "Extrapolation-Directed Crossover for Real-coded GA: Overcoming Deceptive Phenomena by Extrapolative Search" Proc. of Congress on Evolutionary Computation. 655-662 (2001)
- Related Report
  2001 Annual Research Report
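[Publications] Kimura, S., Ono, I., Kita, H., Kobayashi, S.: "An Extension of UNDX Based on Design Guidelines for Crossover: Proposal and Evaluation of ENDX" Transactions of the Society of Instrument and Control Engineers. Vol.37, No.1. 1162-1171 (2001) (in Japanese)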
- Related Report
  2001 Annual Research Report
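[Publications] Takahashi, O., Kimura, S., Kobayashi, S.: "Adaptive Neighboring Search Using Crossover-Like Mutation: Optimization of Deceptive Multimodal Functions" Journal of the Japanese Society for Artificial Intelligence. Vol.16, No.1. 175-184 (2001) (in Japanese)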
- Related Report
  2001 Annual Research Report
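[Publications] Kimura, H., Kobayashi, S.: "An Actor-Critic Algorithm Using Stochastic Binary-Tree Action Selection: Reinforcement Learning That Handles Many Actions" Journal of the Society of Instrument and Control Engineers. Vol.37, No.12. 1147-1155 (2001) (in Japanese)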
- Related Report
  2001 Annual Research Report
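[Publications] Sato, M., Kimura, H., Kobayashi, S.: "A TD Algorithm for Estimating the Variance of Rewards and a Proposal of Mean-Variance Reinforcement Learning" Journal of the Japanese Society for Artificial Intelligence. Vol.16, No.3-F. 353-362 (2001) (in Japanese)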
- Related Report
  2001 Annual Research Report
