2005 Fiscal Year Final Research Report Summary
Model-based reinforcement learning: brain implementation and engineering applications
Project/Area Number | 15300102 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Bioinformatics/Life informatics |
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
ISHII Shin, Nara Institute of Science and Technology, Graduate School of Information Science, Professor (90294280)
|
Co-Investigator (Kenkyū-buntansha) |
SHIBATA Tomohiro, Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (40359873)
YOSHIDA Wako, Nara Institute of Science and Technology, Graduate School of Information Science, Researcher (30379599)
|
Project Period (FY) | 2003 – 2005 |
Keywords | reinforcement learning / prefrontal cortex / computational neuroscience / robot control / Bayesian inference / non-invasive brain activity measurement / system identification |
Research Abstract |
[On-line Bayesian learning schemes] We devised an on-line Bayesian learning algorithm that applies to Gaussian stochastic processes and can estimate the system dimensionality and detect changes in the target dynamics (Hirayama et al., 2004). We also devised a sequential Monte-Carlo-based method applicable to non-Gaussian stochastic processes and applied it to visual tracking problems (Bando et al., in press).
[Applications of model-based reinforcement learning and on-line learning] We enabled a biped robot simulator to walk autonomously by combining a central pattern generator with reinforcement learning, and later extended this approach to incorporate policy-gradient-based reinforcement learning. By further introducing an on-line model identification method, the simulator's autonomous learning was accelerated (Nakamura et al., 2005). Our reinforcement learning scheme for a switching controller succeeded in swinging up and stabilizing an underactuated real robot, the acrobot. An autonomous training scheme combining model-based reinforcement learning with on-line model learning constructed an agent for a multi-agent card game that plays as strongly as a human expert (Ishii et al., 2005).
[Reward-related prefrontal neural activities of primates] An electrophysiological study with a primate memory-based sensorimotor processing task revealed that reward expectation significantly enhanced the selectivity of sensory working memory but not that of motor memory (Amemori et al., 2005).
[Neuropsychological study of human prefrontal information processing] We developed a model of the information processing that occurs while a human performs a Markov decision process, and evaluated the model's plausibility through neuropsychological studies with functional magnetic resonance imaging, finding engagement of the dorsolateral prefrontal cortex (Yoshida et al., 2005). When the Markov decision environment involves uncertainty, its resolution may be performed in the frontopolar prefrontal cortex (Yoshida et al., in press).
|
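The sequential Monte Carlo method mentioned in the abstract can be illustrated with a generic bootstrap particle filter on a toy 1-D tracking problem. This is a minimal sketch of the general technique, not the algorithm of Bando et al.; the random-walk transition model, noise levels, and problem setup are all illustrative assumptions.

```python
import numpy as np

def particle_filter(observations, n_particles=500, seed=0):
    """Bootstrap particle filter for a 1-D random-walk state observed
    through Gaussian noise (generic sketch; all parameters illustrative)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)  # samples from the prior
    estimates = []
    for y in observations:
        # predict: propagate each particle through the transition model
        particles = particles + rng.normal(0.0, 0.5, n_particles)
        # weight: Gaussian observation likelihood p(y | x)
        w = np.exp(-0.5 * (y - particles) ** 2)
        w /= w.sum()
        # estimate: posterior mean under the weighted particle set
        estimates.append(float(w @ particles))
        # resample: draw particles in proportion to their weights
        idx = rng.choice(n_particles, n_particles, p=w)
        particles = particles[idx]
    return estimates

# usage: track a slowly drifting signal from noisy observations
true_states = np.linspace(0.0, 3.0, 30)
obs = true_states + np.random.default_rng(1).normal(0.0, 0.3, 30)
est = particle_filter(obs)
```

Because resampling discards low-weight particles, the filter handles non-Gaussian posteriors that a Kalman filter cannot represent, which is what makes this family of methods suitable for visual tracking.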
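The core loop of model-based reinforcement learning with on-line model learning, as applied above to the acrobot and the card game, can be sketched in its simplest tabular form: estimate a transition model from experience, then plan on the learned model. The 5-state chain MDP below is a hypothetical toy task, not the report's robots or card game.

```python
import numpy as np

def learn_model(transitions, n_states, n_actions):
    """Estimate P(s'|s,a) and R(s,a) from observed (s, a, r, s') tuples."""
    counts = np.zeros((n_states, n_actions, n_states))
    rewards = np.zeros((n_states, n_actions))
    for s, a, r, s2 in transitions:
        counts[s, a, s2] += 1
        rewards[s, a] += r
    visits = counts.sum(axis=2, keepdims=True)
    # unvisited (s, a) pairs fall back to a uniform transition guess
    P = np.where(visits > 0, counts / np.maximum(visits, 1), 1.0 / n_states)
    R = rewards / np.maximum(visits[:, :, 0], 1)
    return P, R

def value_iteration(P, R, gamma=0.95, iters=200):
    """Plan on the learned model: V(s) <- max_a [R(s,a) + gamma * sum P V]."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        V = (R + gamma * P @ V).max(axis=1)
    return V

# usage: a deterministic 5-state chain; action 0 moves left, 1 moves
# right, and entering state 4 yields reward 1
transitions = []
for s in range(5):
    for a in (0, 1):
        s2 = max(s - 1, 0) if a == 0 else min(s + 1, 4)
        transitions.append((s, a, 1.0 if s2 == 4 else 0.0, s2))
P, R = learn_model(transitions, 5, 2)
V = value_iteration(P, R)
```

Interleaving the two steps on-line, so that planning always uses the latest model estimate, is what accelerates learning relative to model-free methods that must experience each outcome many times.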
Research Products
(68 results)