Project/Area Number | 15300102
Research Category | Grant-in-Aid for Scientific Research (B)
Allocation Type | Single-year Grants
Section | General
Research Field | Bioinformatics/Life informatics
Research Institution | Nara Institute of Science and Technology
Principal Investigator | ISHII Shin, Nara Institute of Science and Technology, Graduate School of Information Science, Professor (90294280)
Co-Investigators (Kenkyū-buntansha) |
SHIBATA Tomohiro, Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (40359873)
YOSHIDA Wako, Nara Institute of Science and Technology, Graduate School of Information Science, Researcher (30379599)
AMEMORI Ken-ichi, Hokkaido University, Graduate School of Medicine, Research Associate (70344471)
Project Period (FY) | 2003 – 2005
Project Status | Completed (Fiscal Year 2005)
Budget Amount | ¥12,000,000 (Direct Cost: ¥12,000,000)
Fiscal Year 2005: ¥2,000,000 (Direct Cost: ¥2,000,000)
Fiscal Year 2004: ¥3,900,000 (Direct Cost: ¥3,900,000)
Fiscal Year 2003: ¥6,100,000 (Direct Cost: ¥6,100,000)
Keywords | reinforcement learning / prefrontal cortex / computational neuroscience / robot control / Bayesian inference / non-invasive brain activity measurement / system identification / model identification / robot / Bayesian learning / decision making / visual tracking control / working memory / sequential Monte Carlo method / functional magnetic resonance imaging
Research Abstract |
[On-line Bayesian learning schemes] We devised an on-line Bayesian learning algorithm that can be applied to Gaussian stochastic processes and can estimate both the system dimensionality and the occurrence of changes in the target dynamics (Hirayama et al., 2004). We also devised a sequential Monte-Carlo-based method that can be applied to non-Gaussian stochastic processes and applied it to visual tracking problems (Bando, et al., in press).
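The core of such a sequential Monte Carlo method can be sketched compactly. The following bootstrap particle filter tracks a one-dimensional state under heavy-tailed (non-Gaussian) process noise; the dynamics, noise parameters, and function names are illustrative assumptions, not the method of Bando et al.

```python
# Minimal bootstrap particle filter (sequential Monte Carlo) for a
# 1-D tracking problem with non-Gaussian process noise.
# All model choices and parameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N = 500  # number of particles

def transition(x):
    # Random-walk dynamics with Student-t (heavy-tailed) noise.
    return x + 0.5 * rng.standard_t(df=3, size=x.shape)

def likelihood(y, x, sigma=1.0):
    # Gaussian observation model p(y | x).
    return np.exp(-0.5 * ((y - x) / sigma) ** 2)

def particle_filter(observations):
    particles = rng.normal(0.0, 5.0, size=N)  # sample from the prior
    estimates = []
    for y in observations:
        particles = transition(particles)        # predict step
        weights = likelihood(y, particles)       # weight by evidence
        weights /= weights.sum()
        estimates.append(np.sum(weights * particles))  # posterior mean
        # Resample to avoid weight degeneracy.
        idx = rng.choice(N, size=N, p=weights)
        particles = particles[idx]
    return np.array(estimates)

# Synthetic demo: track a slowly drifting target from noisy observations.
true_x = np.cumsum(rng.normal(0.0, 0.5, size=100))
obs = true_x + rng.normal(0.0, 1.0, size=100)
print(particle_filter(obs)[-5:])
```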
[Applications of model-based reinforcement learning and on-line learning] We enabled a biped robot simulator to walk autonomously by combining a central pattern generator with reinforcement learning. We later extended this approach to incorporate policy-gradient-based reinforcement learning, and by further introducing an on-line model identification method, the autonomous learning of the biped simulator was accelerated (Nakamura et al., 2005). Our reinforcement learning for a switching controller succeeded in swinging up and stabilizing an underactuated real robot, the acrobot. An autonomous training scheme combining model-based reinforcement learning with on-line model learning constructed an agent for a multi-agent card game that plays as strongly as a human expert (Ishii, et al., 2005).
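For concreteness, a policy-gradient update of the kind referred to above can be written in a few lines. This is a generic REINFORCE sketch with a running-average baseline on an invented two-action task; the environment, learning rate, and all names are assumptions and do not reproduce the biped or acrobot controllers.

```python
# Minimal REINFORCE policy-gradient sketch on a toy two-action task.
# The task and all parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
theta = np.zeros(2)  # logits of a softmax policy over two actions

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def episode_return(action):
    # Hypothetical environment: action 1 pays more on average.
    return rng.normal(1.0 if action == 1 else 0.2, 0.1)

baseline, alpha = 0.0, 0.1
for t in range(2000):
    pi = softmax(theta)
    a = rng.choice(2, p=pi)
    R = episode_return(a)
    baseline += 0.01 * (R - baseline)   # running-average baseline
    grad_logpi = -pi
    grad_logpi[a] += 1.0                # d log pi(a) / d theta for softmax
    theta += alpha * (R - baseline) * grad_logpi  # REINFORCE update

print(softmax(theta))  # the policy should concentrate on action 1
```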
[Reward-related prefrontal neural activities of primates] An electrophysiological study with a primate memory-based sensorimotor processing task revealed that reward expectation significantly enhanced the selectivity of sensory working memory but not that of motor memory (Amemori, et al., 2005).
[Neuropsychological study of human prefrontal information processing] We developed an information processing model of a human performing a Markov decision process and evaluated the model's plausibility by means of neuropsychological studies with functional magnetic resonance imaging. We found engagement of the dorsolateral prefrontal cortex (Yoshida, et al., 2005). When the Markov decision environment involves uncertainty, its resolution could be performed in the frontopolar prefrontal cortex (Yoshida, et al., in press).
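For readers unfamiliar with the formalism, the Markov decision process underlying such tasks can be summarized by a short value-iteration sketch; the transition matrix and rewards below are randomly generated for illustration and are unrelated to the actual experimental task.

```python
# Minimal value iteration for a generic Markov decision process.
# Transitions and rewards are invented; this only illustrates the formalism.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(2)

# P[a, s, s'] : transition probabilities; R[s, a] : expected rewards.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.normal(size=(n_states, n_actions))

V = np.zeros(n_states)
for _ in range(200):
    Q = R + gamma * np.einsum("ast,t->sa", P, V)  # Bellman backup
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)
print("optimal values:", np.round(V, 2), "policy:", policy)
```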