Project/Area Number |
21300113
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Bioinformatics/Life informatics
|
Research Institution | Kyoto University |
Principal Investigator |
ISHII Shin Kyoto University, Graduate School of Informatics, Professor (90294280)
|
Co-Investigator(Kenkyū-buntansha) |
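NAKAMURA Yutaka Osaka University, Graduate School of Engineering Science, Assistant Professor (70403334)
MAEDA Shinichi Kyoto University, Graduate School of Informatics, Assistant Professor (20379530)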
|
Co-Investigator(Renkei-kenkyūsha) |
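MORI Takeshi Osaka University, Graduate School of Engineering Science, Researcher
OSHIO Ritz Kyoto University, Graduate School of Informatics, Researcher
SHIKAUCHI Yumi Kyoto University, Graduate School of Informatics, Technical Assistant
MORIMOTO Satoshi Kyoto University, Graduate School of Informatics, Technical Assistant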
|
Project Period (FY) |
2009 – 2011
|
Project Status |
Completed (Fiscal Year 2011)
|
Budget Amount |
¥18,070,000 (Direct Cost: ¥13,900,000, Indirect Cost: ¥4,170,000)
Fiscal Year 2011: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Fiscal Year 2010: ¥7,280,000 (Direct Cost: ¥5,600,000, Indirect Cost: ¥1,680,000)
Fiscal Year 2009: ¥8,060,000 (Direct Cost: ¥6,200,000, Indirect Cost: ¥1,860,000)
|
Keywords | Reinforcement learning / Modular architecture / Computational neuroscience / Robot / Non-invasive brain measurement |
Research Abstract |
We have developed statistical learning models, with a particular focus on reinforcement learning (RL), that can make decisions in uncertain and even non-stationary environments. We have derived an RL method in which a value function represented by a modular structure is approximated online and efficiently by adding new modules incrementally, as well as an optimal learning procedure for the value function based on the framework of semiparametric statistics. As an application, we have succeeded in automatically controlling non-holonomic systems by means of a policy-based RL method. In the human brain, we have found module-like structures that are activated while subjects perform a hierarchical inference task. Moreover, we have succeeded in decoding the inference process from the subjects' behavior and MRI images.
|