Project/Area Number | 13480089 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Tokyo Institute of Technology |
Principal Investigator | KOBAYASHI Shigenobu, Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Professor (40016697) |
Co-Investigator (Kenkyū-buntansha) | KIMURA Hajime, Tokyo Institute of Technology, Interdisciplinary School of Science and Technology, Research Associate (40302963) |
Project Period (FY) | 2001 – 2003 |
Project Status | Completed (Fiscal Year 2003) |
Budget Amount | ¥13,600,000 (Direct Cost: ¥13,600,000)
Fiscal Year 2003: ¥2,900,000 (Direct Cost: ¥2,900,000)
Fiscal Year 2002: ¥2,900,000 (Direct Cost: ¥2,900,000)
Fiscal Year 2001: ¥7,800,000 (Direct Cost: ¥7,800,000)
|
Keywords | Evolutionary Computation / Real-coded Genetic Algorithms / UV Structure / k-tablet Structure / Reinforcement Learning / Actor-Critic / Four-legged Walking Robot / Distributed Reinforcement Learning / α-domination Strategy / Profit Sharing / Powder X-ray Analysis / Protein Structure Determination / UV-structure Hypothesis / Multi-agent Systems / Evolutionary Systems / Genetic Algorithms / Multi-objective GA / Adaptive Systems / Multi-agent Reinforcement Learning / Rationality of Reward Sharing |
Research Abstract |
A. Research Results on Evolutionary Computation
・We proposed a robust real-coded GA that combines two crossover operators, UNDX-m and EDX. It can handle both ridge-structured functions whose dimensionality reaches into the hundreds and multi-peaked functions.
・We proposed a new evolutionary algorithm called ANS (Adaptive Neighboring Search), which uses a crossover-like mutation to optimize high-dimensional, deceptive multimodal functions.
・We formulated a hypothesis called the "UV-phenomenon" that explains failures of global search by GAs. It identifies UV-structures as hard landscape structures that cause the UV-phenomenon.
・We proposed a new crossover, LUNDX-m, which uses only m-dimensional latent variables. LUNDX-m can handle high-dimensional, ill-scaled structures called k-tablet structures.
B. Research Results on Reinforcement Learning
・In multi-agent reinforcement learning systems, how a reward is shared among all agents is important. We derived the necessary and sufficient condition on reward sharing that preserves rationality and realizes cooperative behaviors.
・We proposed a policy-function representation consisting of a stochastic binary decision tree and applied it within an actor-critic algorithm to problems that have an enormous number of similar actions.
・We investigated reinforcement learning of walking behavior for a four-legged robot. We presented a new actor-critic algorithm in which the actor selects a continuous action from its bounded action space using the normal distribution.
・We proposed a new simulation-based distributed reinforcement-learning approach that solves large planning problems under uncertain environments, and applied it to real sewerage control systems.
|
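To make the real-coded GA results above concrete, here is a minimal sketch of a generic real-coded GA on an ill-scaled quadratic. The crossover shown is the simple blend crossover BLX-α, used only as a stand-in: the project's UNDX-m and EDX are more sophisticated multi-parent operators, and the test function, population size, and steady-state replacement scheme are illustrative assumptions, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, POP, ALPHA = 10, 50, 0.5

def f(x):
    # Ill-scaled quadratic: a toy analogue of a ridge / k-tablet-like landscape.
    scales = np.logspace(0, 2, DIM)  # condition number of 10^2 across axes
    return float(np.sum((scales * x) ** 2))

def blx_alpha(p1, p2):
    # BLX-alpha: sample the child uniformly from a box slightly larger
    # than the box spanned by the two parents.
    lo, hi = np.minimum(p1, p2), np.maximum(p1, p2)
    d = hi - lo
    return rng.uniform(lo - ALPHA * d, hi + ALPHA * d)

pop = rng.uniform(-2, 2, size=(POP, DIM))
f0 = min(f(ind) for ind in pop)  # best fitness in the initial population

for _ in range(300):
    i, j = rng.choice(POP, size=2, replace=False)
    child = blx_alpha(pop[i], pop[j])
    # Family-based replacement: the child replaces the worse parent if better.
    worse = i if f(pop[i]) > f(pop[j]) else j
    if f(child) < f(pop[worse]):
        pop[worse] = child

best = min(pop, key=f)
print(f(best))  # should be below the initial best, f0
```

The family-based replacement keeps selection pressure local to the mating pair, so the population best can never degrade; a generational scheme such as MGG would be closer to common practice in this line of work.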
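The four-legged-robot result above describes an actor that draws a continuous action from a bounded space via a normal distribution. The following is a hedged, single-state sketch of that idea: a Gaussian policy with learned mean and spread, clipped to the action bounds, updated by a one-step TD error. The toy reward, step sizes, and clipping are illustrative assumptions and not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

A_LOW, A_HIGH = -1.0, 1.0   # bounded action space (assumed)
mu, log_sigma = 0.0, 0.0    # Gaussian actor parameters for a single state
value = 0.0                 # critic estimate V for that state
alpha_actor, alpha_critic = 0.05, 0.1

def reward(a):
    # Illustrative reward with its peak at action 0.3, inside the bounds.
    return -(a - 0.3) ** 2

for _ in range(2000):
    sigma = np.exp(log_sigma)
    a = rng.normal(mu, sigma)
    a = float(np.clip(a, A_LOW, A_HIGH))   # keep the action in its bounds
    td_error = reward(a) - value           # one-step, episodic (gamma = 0)
    value += alpha_critic * td_error       # critic update
    # Policy-gradient updates for the Gaussian actor (score-function form):
    mu += alpha_actor * td_error * (a - mu) / sigma**2
    log_sigma += alpha_actor * td_error * ((a - mu) ** 2 / sigma**2 - 1.0)
    log_sigma = max(log_sigma, np.log(0.05))  # floor keeps some exploration

print(round(mu, 2))  # the mean drifts toward the rewarded region near 0.3
```

The critic's value estimate acts as a baseline, so actions that do better than expected pull the mean toward them while the spread shrinks as the policy settles; clipping the sampled action is a simple (slightly biased) way to respect the bounded action space.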