Model-based reinforcement learning : brain implementation and engineering applications

Research Project

Project/Area Number	15300102
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Bioinformatics/Life informatics
Research Institution	Nara Institute of Science and Technology
Principal Investigator	ISHII Shin, Nara Institute of Science and Technology, Graduate School of Information Science, Professor (90294280)
Co-Investigator(Kenkyū-buntansha)	SHIBATA Tomohiro, Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (40359873); YOSHIDA Wako, Nara Institute of Science and Technology, Graduate School of Information Science, Researcher (30379599); AMEMORI Kenichi, Hokkaido University, Graduate School of Medicine, Research Associate (70344471)
Project Period (FY)	2003 – 2005
Project Status	Completed (Fiscal Year 2005)
Budget Amount	¥12,000,000 (Direct Cost: ¥12,000,000); Fiscal Year 2005: ¥2,000,000 (Direct Cost: ¥2,000,000); Fiscal Year 2004: ¥3,900,000 (Direct Cost: ¥3,900,000); Fiscal Year 2003: ¥6,100,000 (Direct Cost: ¥6,100,000)
Keywords	reinforcement learning / prefrontal cortex / computational neuroscience / robot control / Bayesian inference / non-invasive brain activity measurement / system identification / model identification / robot / Bayesian learning / decision making / visual tracking control / working memory / sequential Monte Carlo method / functional magnetic resonance imaging
Research Abstract	[On-line Bayesian learning schemes] We devised an on-line Bayesian learning algorithm that can be applied to Gaussian stochastic processes and can estimate both the system dimensionality and the occurrence of changes in the target dynamics (Hirayama et al., 2004). We also devised a sequential Monte Carlo method that can be applied to non-Gaussian stochastic processes, and applied it to visual tracking problems (Bando et al., in press).
[Applications of model-based reinforcement learning and on-line learning] By combining a central pattern generator with reinforcement learning, we enabled a biped robot simulator to walk autonomously. We later extended this approach to incorporate policy-gradient-based reinforcement learning. Further introducing an on-line model identification method accelerated the autonomous learning of the biped simulator (Nakamura et al., 2005). Our reinforcement learning method for a switching controller succeeded in swinging up and stabilizing an underactuated real robot, the acrobot. An autonomous training scheme that combines model-based reinforcement learning with on-line model learning produced an agent for a multi-agent card game that is as strong as a human expert player (Ishii et al., 2005).
[Reward-related prefrontal neural activities of primates] An electrophysiological study using a memory-based sensorimotor processing task in primates revealed that reward expectation significantly enhanced the selectivity of sensory working memory but not that of motor memory (Amemori et al., 2005).
[Neuropsychological study of human prefrontal information processing] We developed a model of the information processing carried out while a human performs a Markov decision process, and evaluated the model's plausibility in neuropsychological studies using functional magnetic resonance imaging. We found engagement of the dorsolateral prefrontal cortex (Yoshida et al., 2005). When the Markov decision environment involves uncertainty, that uncertainty may be resolved in the frontopolar prefrontal cortex (Yoshida et al., in press).
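The general idea behind the abstract's "model-based reinforcement learning with on-line model learning" can be illustrated with a minimal tabular sketch: estimate a transition and reward model from experience, then plan on the estimated model. This is not the project's algorithm (the abstract refers to Bayesian model identification and policy-gradient methods); the 4-state chain environment `toy_step` and the helper names `learn_model` and `value_iteration` are hypothetical and chosen only for illustration.

```python
import random


def learn_model(transitions, n_states, n_actions):
    """Estimate transition probabilities P and mean rewards R from (s, a, r, s') samples."""
    counts = [[[0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    reward_sum = [[0.0] * n_actions for _ in range(n_states)]
    for s, a, r, s_next in transitions:
        counts[s][a][s_next] += 1
        reward_sum[s][a] += r
    P = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    R = [[0.0] * n_actions for _ in range(n_states)]
    for s in range(n_states):
        for a in range(n_actions):
            n = sum(counts[s][a])
            if n == 0:
                continue  # unvisited pair: left at zero (an assumption of this sketch)
            R[s][a] = reward_sum[s][a] / n
            P[s][a] = [c / n for c in counts[s][a]]
    return P, R


def value_iteration(P, R, gamma=0.9, sweeps=200):
    """Plan on the estimated model; returns state values and a greedy policy."""
    n_states, n_actions = len(P), len(P[0])
    V = [0.0] * n_states

    def q(s, a):
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    for _ in range(sweeps):
        V = [max(q(s, a) for a in range(n_actions)) for s in range(n_states)]
    policy = [max(range(n_actions), key=lambda a: q(s, a)) for s in range(n_states)]
    return V, policy


def toy_step(s, a, rng, n_states=4):
    """Hypothetical chain: action 1 moves right, action 0 moves left; 10% of moves are reversed.
    Reward 1 is given whenever the next state is the last state."""
    move = 1 if a == 1 else -1
    if rng.random() < 0.1:
        move = -move
    s_next = min(max(s + move, 0), n_states - 1)
    r = 1.0 if s_next == n_states - 1 else 0.0
    return r, s_next


# Collect experience with a random exploration policy, then learn the model and plan.
rng = random.Random(0)
data, s = [], 0
for _ in range(5000):
    a = rng.randrange(2)
    r, s_next = toy_step(s, a, rng)
    data.append((s, a, r, s_next))
    s = s_next

P, R = learn_model(data, n_states=4, n_actions=2)
V, policy = value_iteration(P, R)
print(policy)
```

In this toy setting the planner should prefer moving right in every state, since reward is only available at the right end of the chain. An on-line variant would update the counts after every step and re-plan periodically instead of learning from a fixed batch.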

Report

(4 results)

2005 Annual Research Report Final Research Report Summary
2004 Annual Research Report
2003 Annual Research Report

Research Products
(99 results)

Journal Article (83 results) / Book (2 results) / Patent (Industrial Property Rights) (3 results) / Publications (11 results)

[Journal Article] Anterior and superior lateral occipito-temporal cortex responsible for target motion prediction during overt and covert visual pursuit (2006)
- Author(s)
  Kawawaki, D.
- Journal Title
  
  Neuroscience Research 54・2
  
  Pages: 112-123
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Anterior and superior lateral occipito-temporal cortex responsible for target motion prediction during overt and covert visual pursuit (2006)
- Author(s)
  Kawawaki, D.
- Journal Title
  
  Neuroscience Research 54(2)
  
  Pages: 112-123
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Model-based reinforcement learning for large-scale multi-agent games with sampling-based state estimation (2006)
- Author(s)
  Fujita, H.
- Journal Title
  
  The eleventh international symposium on Artificial Life and Robotics GS3-1
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Natural policy gradient reinforcement learning method for a looper-like robot (2006)
- Author(s)
  Nakamura, Y.
- Journal Title
  
  The eleventh international symposium on Artificial Life and Robotics GS3-3
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A Bayesian approach to blind source separation with variable number of sources (2006)
- Author(s)
  Hirayama, J.
- Journal Title
  
  The eleventh international symposium on Artificial Life and Robotics GS19-6
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Reinforcement learning of switching multiple controllers to control a real robot (2006)
- Author(s)
  Tokita, Y.
- Journal Title
  
  The eleventh international symposium on Artificial Life and Robotics GS22-3
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Prediction of the aperiodic time series of a visual target by humans (2006)
- Author(s)
  Shikauchi, M.
- Journal Title
  
  The eleventh international symposium on Artificial Life and Robotics GS1-4
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Nonlinear and noisy extension of independent component analysis : theory and its application to a pitch sensation model (2005)
- Author(s)
  Maeda, S.
- Journal Title
  
  Neural Computation 17・1
  
  Pages: 115-144
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Model-based reinforcement learning : A computational model and an fMRI study (2005)
- Author(s)
  Yoshida, W.
- Journal Title
  
  Neurocomputing 63C
  
  Pages: 253-269
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Temporal reasoning about two concurrent sequences of events (2005)
- Author(s)
  Ishihara, Y.
- Journal Title
  
  SIAM Journal on Computing 34・2
  
  Pages: 498-513
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A model of smooth pursuit in primates based on learning the target dynamics (2005)
- Author(s)
  Shibata, T.
- Journal Title
  
  Neural Networks 18・3
  
  Pages: 213-224
- NAID
  10016443900
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A reinforcement learning scheme for a partially-observable multi-agent game (2005)
- Author(s)
  Ishii, S.
- Journal Title
  
  Machine Learning 59
  
  Pages: 31-54
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
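[Journal Article] Control of a real acrobot by learning the switching of multiple controllers (in Japanese) (2005)
- Author(s)
  Nishimura, M.
- Journal Title
  
  Transactions of the IEICE (Japanese Edition) J88-A(5)
  
  Pages: 646-657
- Description
  From the Final Research Report Summary (in Japanese)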
- Related Report
  2005 Annual Research Report 2005 Final Research Report Summary
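[Journal Article] Fundamental theory and applications of reinforcement learning (in Japanese) (2005)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  Journal of the Society of Instrument and Control Engineers 44(5)
  
  Pages: 313-318
- Description
  From the Final Research Report Summary (in Japanese)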
- Related Report
  2005 Final Research Report Summary
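[Journal Article] Reinforcement learning : theory and applications (in Japanese) (2005)
- Author(s)
  Ishii, S.
- Journal Title
  
  Journal of the IEICE 88(1)
  
  Pages: 804-810
- NAID
  110004046147
- Description
  From the Final Research Report Summary (in Japanese)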
- Related Report
  2005 Final Research Report Summary
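[Journal Article] A reinforcement learning method based on the policy gradient method and its application to biped locomotion control (in Japanese) (2005)
- Author(s)
  Mori, T.
- Journal Title
  
  Transactions of the IEICE (Japanese Edition) J88-D-II(6)
  
  Pages: 1080-1089
- Description
  From the Final Research Report Summary (in Japanese)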
- Related Report
  2005 Annual Research Report 2005 Final Research Report Summary
[Journal Article] Acrobot control by learning the switching of multiple controllers (2005)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  Journal of Artificial Life and Robotics 9(2)
  
  Pages: 67-71
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
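[Journal Article] Model-identification-based reinforcement learning for a partially observable card game (in Japanese) (2005)
- Author(s)
  Fujita, H.
- Journal Title
  
  Transactions of the IEICE (Japanese Edition) J88-D-II(11)
  
  Pages: 2277-2287
- Description
  From the Final Research Report Summary (in Japanese)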
- Related Report
  2005 Annual Research Report 2005 Final Research Report Summary
[Journal Article] Off-policy natural policy gradient method for a biped walking using a CPG controller (2005)
- Author(s)
  Nakamura, Y.
- Journal Title
  
  Journal of Robotics and Mechatronics 17・6
  
  Pages: 636-644
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Contrasting effects of reward expectation on sensory and motor memories in primate prefrontal neurons (2005)
- Author(s)
  Amemori, K.
- Journal Title
  
  Cerebral Cortex doi:10.1093/cercor/bhj042
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Nonlinear and noisy extension of independent component analysis : theory and its application to a pitch sensation model (2005)
- Author(s)
  Maeda, S.
- Journal Title
  
  Neural Computation 17(1)
  
  Pages: 115-144
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Temporal reasoning about two concurrent sequences of events (2005)
- Author(s)
  Ishihara, Y.
- Journal Title
  
  SIAM Journal on Computing 34(2)
  
  Pages: 498-513
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A model of smooth pursuit in primates based on learning the target dynamics (2005)
- Author(s)
  Shibata, T.
- Journal Title
  
  Neural Networks 18(3)
  
  Pages: 213-224
- NAID
  10016443900
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Acrobot control by learning the switching of multiple controllers (2005)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  Journal of Artificial Life and Robotics 9(2)
  
  Pages: 67-71
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Off-policy natural policy gradient method for a biped walking using a CPG controller (2005)
- Author(s)
  Nakamura, Y.
- Journal Title
  
  Journal of Robotics and Mechatronics 17(6)
  
  Pages: 636-644
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Contrasting effects of reward expectation on sensory and motor memories in primate prefrontal neurons (2005)
- Author(s)
  Amemori, K.
- Journal Title
  
  Cerebral Cortex
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Hard/soft switching particle filters for efficient real-time visual tracking (2005)
- Author(s)
  Bando, T.
- Journal Title
  
  Proceedings of the Tenth International Symposium on Artificial Life and Robotics GS15-5
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Gradual emergence of communication in a multi-agent environment (2005)
- Author(s)
  Tensho, S.
- Journal Title
  
  Proceedings of the Tenth International Symposium on Artificial Life and Robotics GS16-3
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Prediction-based optimal controls in artificial and real intelligences (2005)
- Author(s)
  Ishii, S.
- Journal Title
  
  Proceedings of International Symposium on The Art of Statistical Metaware
  
  Pages: 111-121
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Bayesian noisy ICA for source switching environments (2005)
- Author(s)
  Hirayama, J.
- Journal Title
  
  IEEE Workshop for Statistical Signal Processing 232
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Reinforcement learning of stable trajectory for quasi-passive-dynamic walking (2005)
- Author(s)
  Hitomi, K.
- Journal Title
  
  Modeling Natural Action Selection : Proceedings of an International Workshop
  
  Pages: 229-234
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] On-line learning of a feedback controller for quasi-passive-dynamic walking by a stochastic policy gradient method (2005)
- Author(s)
  Hitomi, K.
- Journal Title
  
  IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS) (IEEE)
  
  Pages: 1923-1928
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] An off-policy natural gradient method for a partial observable Markov decision process (2005)
- Author(s)
  Nakamura, Y.
- Journal Title
  
  Artificial Neural Networks : Formal Models and Their Applications - ICANN 2005, Lecture Notes in Computer Science 3697
  
  Pages: 431-436
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Model-based reinforcement learning for a multi-player card game with partial observability (2005)
- Author(s)
  Fujita, H.
- Journal Title
  
  The 2005 IEEE-WIC-ACM International Conference on Intelligent Agent Technology (IEEE)
  
  Pages: 467-470
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Localization of cyber rodent based on mixture Kalman filters (2005)
- Author(s)
  Magono, M.
- Journal Title
  
  Proceedings of 2005 International Symposium on Nonlinear Theory and its Applications
  
  Pages: 401-404
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Extended force/tactile senses of machines by measurement of user's biological signals (2005)
- Author(s)
  Nomura, T.
- Journal Title
  
  Proceedings 36th International Symposium on Robotics
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A reinforcement learning scheme for a partially-observable multi-agent game (2005)
- Author(s)
  S.Ishii
- Journal Title
  
  Machine Learning 59
  
  Pages: 31-54
- Related Report
  2005 Annual Research Report
[Journal Article] Acrobot control by learning the switching of multiple controllers (2005)
- Author(s)
  J.Yoshimoto
- Journal Title
  
  Journal of Artificial Life and Robotics 9・2
  
  Pages: 67-71
- Related Report
  2005 Annual Research Report
[Journal Article] Contrasting effects of reward expectation on sensory and motor memories in primate prefrontal neurons (2005)
- Author(s)
  K.Amemori
- Journal Title
  
  Cerebral Cortex doi:10.1093/cercor/bhj042
- Related Report
  2005 Annual Research Report
[Journal Article] Off-policy natural policy gradient method for a biped walking using a CPG controller (2005)
- Author(s)
  Y.Nakamura
- Journal Title
  
  Journal of Robotics and Mechatronics 17・6
  
  Pages: 636-644
- Related Report
  2005 Annual Research Report
[Journal Article] Bayesian noisy ICA for source switching environments (2005)
- Author(s)
  J.Hirayama
- Journal Title
  
  IEEE Workshop for Statistical Signal Processing
  
  Pages: 232-232
- Related Report
  2005 Annual Research Report
[Journal Article] On-line learning of a feedback controller for quasi-passive-dynamic walking by a stochastic policy gradient method (2005)
- Author(s)
  K.Hitomi
- Journal Title
  
  IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  
  Pages: 1923-1928
- Related Report
  2005 Annual Research Report
[Journal Article] An off-policy natural gradient method for a partial observable Markov decision process (2005)
- Author(s)
  Y.Nakamura
- Journal Title
  
  Artificial Neural Networks: Formal Models and Their Applications - ICANN 2005 LNCS3697
  
  Pages: 431-436
- Related Report
  2005 Annual Research Report
[Journal Article] Nonlinear and noisy extension of independent component analysis : theory and its application to a pitch sensation model (2005)
- Author(s)
  Maeda, S., et al.
- Journal Title
  
  Neural Computation 17・1
  
  Pages: 115-144
- Related Report
  2004 Annual Research Report
[Journal Article] Model-based reinforcement learning : A computational model and an fMRI study (2005)
- Author(s)
  Yoshida, W., et al.
- Journal Title
  
  Neurocomputing 63C
  
  Pages: 253-269
- Related Report
  2004 Annual Research Report
[Journal Article] A reinforcement learning scheme for a partially-observable multi-agent game (2005)
- Author(s)
  Ishii, S., et al.
- Journal Title
  
  Machine Learning 59
- Related Report
  2004 Annual Research Report
[Journal Article] Hard/soft switching particle filters for efficient real-time visual tracking (2005)
- Author(s)
  Bando, T., et al.
- Journal Title
  
  Proceedings of the Tenth International Symposium on Artificial Life and Robotics
- Related Report
  2004 Annual Research Report
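[Journal Article] Design of product codes by learning (in Japanese) (2004)
- Author(s)
  Maeda, S.
- Journal Title
  
  Transactions of the IEICE (Japanese Edition) J87-A(3)
  
  Pages: 382-390
- Description
  From the Final Research Report Summary (in Japanese)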
- Related Report
  2005 Final Research Report Summary
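[Journal Article] Brain mechanisms of reinforcement learning and their regulation by emotion (in Japanese) (2004)
- Author(s)
  Yoshida, W.
- Journal Title
  
  Japanese Psychological Review 47(1)
  
  Pages: 150-164
- NAID
  130007631256
- Description
  From the Final Research Report Summary (in Japanese)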
- Related Report
  2005 Final Research Report Summary
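[Journal Article] Acquisition of walking movement by a reinforcement learning method using a neural oscillator network (in Japanese) (2004)
- Author(s)
  Nakamura, Y.
- Journal Title
  
  Transactions of the IEICE (Japanese Edition) J87-D-II(3)
  
  Pages: 893-902
- Description
  From the Final Research Report Summary (in Japanese)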
- Related Report
  2005 Final Research Report Summary
[Journal Article] Self-organization of delay lines by spike-time-dependent learning (2004)
- Author(s)
  Amemori, K.
- Journal Title
  
  Neurocomputing 61
  
  Pages: 291-316
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
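[Journal Article] Reinforcement learning in partially observable environments and its application to multi-agent games (in Japanese) (2004)
- Author(s)
  Ishii, S.
- Journal Title
  
  Systems, Control and Information 48(9)
  
  Pages: 383-388
- NAID
  110003892320
- Description
  From the Final Research Report Summary (in Japanese)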
- Related Report
  2005 Final Research Report Summary
[Journal Article] Bayesian representation learning in cortex regulated by acetylcholine (2004)
- Author(s)
  Hirayama J.
- Journal Title
  
  Neural Networks 17・10
  
  Pages: 1391-1400
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Bayesian representation learning in cortex regulated by acetylcholine (2004)
- Author(s)
  Hirayama, J.
- Journal Title
  
  Neural Networks 17(10)
  
  Pages: 1391-1400
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A probabilistic approach to identify the environmental models of mobile robots (2004)
- Author(s)
  Kanemoto, K.
- Journal Title
  
  Proceedings of the Ninth International Symposium on Artificial Life and Robotics 1
  
  Pages: 329-332
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Acrobot control by learning the switching of multiple controllers (2004)
- Author(s)
  Nishimura, M.
- Journal Title
  
  Proceedings of the Ninth International Symposium on Artificial Life and Robotics 2
  
  Pages: 633-636
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Optimization of product code (2004)
- Author(s)
  Maeda, S.
- Journal Title
  
  WSEAS Transactions on Systems 2(3)
  
  Pages: 473-476
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Application of multivariate autoregression modeling for analyzing the interaction between EEG and EMG in humans (2004)
- Author(s)
  Shibata, T.
- Journal Title
  
  International Congress Series 1270
  
  Pages: 249-253
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A reinforcement learning scheme for a multi-agent card game with Monte Carlo state estimation (2004)
- Author(s)
  Fujita, H.
- Journal Title
  
  International Conference on Computational Intelligence for Modelling Control and Automation
  
  Pages: 799-806
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Reinforcement learning for CPG-driven biped robot (2004)
- Author(s)
  Mori, T.
- Journal Title
  
  The Nineteenth National Conference on Artificial Intelligence (AAAI)
  
  Pages: 623-630
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A solving method for MDPs by minimizing variational free energy (2004)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  International Joint Conference on Neural Networks (IJCNN) (IEEE) 3
  
  Pages: 1817-1822
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Switching particle filters for efficient real-time visual tracking (2004)
- Author(s)
  Bando, T.
- Journal Title
  
  International Conference on Pattern Recognition (ICPR) 2
  
  Pages: 720-723
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Cortical representation learning regulated by acetylcholine (2004)
- Author(s)
  Hirayama, J.
- Journal Title
  
  Brain Inspired Cognitive Systems (Stirling, Sep., 2004), ICESS 3
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Natural policy gradient reinforcement learning for a CPG control of a biped robot (2004)
- Author(s)
  Nakamura, Y.
- Journal Title
  
  Parallel Problem Solving from Nature - PPSN VIII, Lecture Notes in Computer Science 3242
  
  Pages: 972-981
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A noisy nonlinear independent component analysis (2004)
- Author(s)
  Maeda, S.
- Journal Title
  
  2004 IEEE International Workshop on Machine Learning for Signal Processing (IEEE)
  
  Pages: 173-182
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] An imaging study on human action selection using hierarchical rule (2004)
- Author(s)
  Funakoshi, H.
- Journal Title
  
  The Third International Conference on Development and Learning (ICDL)
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Reinforcement learning for a snake-like robot (2004)
- Author(s)
  Fukunaga, S.
- Journal Title
  
  Proceedings of 2004 International Symposium on Nonlinear Theory and its Applications
  
  Pages: 75-78
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Dopamine-induced depressing synapses sustain neural activities in prefrontal cortex : a simulation study (2004)
- Author(s)
  Igarashi, Y.
- Journal Title
  
  Proceedings of 2004 International Symposium on Nonlinear Theory and its Applications
  
  Pages: 509-512
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Reinforcement learning for a snake-like robot controlled by a central pattern generator (2004)
- Author(s)
  Fukunaga, S.
- Journal Title
  
  Proceedings of the 2004 IEEE Conference on Robotics, Automation and Mechatronics (IEEE)
  
  Pages: 909-914
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A solving method for MDPs by minimizing variational free energy (2004)
- Author(s)
  Yoshimoto, J., et al.
- Journal Title
  
  International Joint Conference on Neural Networks 3
  
  Pages: 1817-1822
- Related Report
  2004 Annual Research Report
[Journal Article] Switching particle filters for efficient real-time visual tracking (2004)
- Author(s)
  Bando, T., et al.
- Journal Title
  
  International Conference on Pattern Recognition 2
  
  Pages: 720-723
- Related Report
  2004 Annual Research Report
[Journal Article] Cortical representation learning regulated by acetylcholine (2004)
- Author(s)
  Hirayama, J., et al.
- Journal Title
  
  Brain Inspired Cognitive Systems
- Related Report
  2004 Annual Research Report
[Journal Article] A noisy nonlinear independent component analysis (2004)
- Author(s)
  Maeda, S., et al.
- Journal Title
  
  IEEE International Workshop on Machine Learning for Signal Processing
  
  Pages: 173-182
- Related Report
  2004 Annual Research Report
[Journal Article] An imaging study on human action selection using hierarchical rule (2004)
- Author(s)
  Funakoshi, H., et al.
- Journal Title
  
  The Third International Conference on Development and Learning
- Related Report
  2004 Annual Research Report
[Journal Article] Bayesian representation learning in cortex regulated by acetylcholine (2004)
- Author(s)
  Hirayama, J., et al.
- Journal Title
  
  Neural Networks 17・10
  
  Pages: 1391-1400
- Related Report
  2004 Annual Research Report
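[Journal Article] On-line EM reinforcement learning for automatic control of continuous dynamical systems (in Japanese) (2003)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  Transactions of the Institute of Systems, Control and Information Engineers 16(5)
  
  Pages: 209-217
- Description
  From the Final Research Report Summary (in Japanese)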
- Related Report
  2005 Final Research Report Summary
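[Journal Article] Normalized Gaussian network based on variational Bayesian inference and hierarchical model selection (in Japanese) (2003)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  Transactions of the Society of Instrument and Control Engineers 39(5)
  
  Pages: 503-512
- Description
  From the Final Research Report Summary (in Japanese)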
- Related Report
  2005 Final Research Report Summary
[Journal Article] A model-based reinforcement learning : a computational model and an fMRI study (2003)
- Author(s)
  Yoshida, W.
- Journal Title
  
  11th European Symposium on Artificial Neural Networks, Belgium : d-side publications
  
  Pages: 313-318
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] System identification based on on-line variational Bayes method and its application to reinforcement learning (2003)
- Author(s)
  Yoshimoto, J.
- Journal Title
  
  Artificial Neural Networks and Neural Information Processing, Lecture Notes in Computer Science (Berlin : Springer-Verlag) 2714
  
  Pages: 123-131
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Prior hyperparameters in Bayesian PCA (2003)
- Author(s)
  Oba, S.
- Journal Title
  
  Artificial Neural Networks and Neural Information Processing, Lecture Notes in Computer Science (Berlin : Springer-Verlag) 2714
  
  Pages: 271-279
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] A reinforcement learning scheme for a multi-agent card game (2003)
- Author(s)
  Fujita, H.
- Journal Title
  
  2003 IEEE International Conference on Systems, Man & Cybernetics
  
  Pages: 4071-4078
- Description
  From the Final Research Report Summary (in English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Balancing plasticity and stability of on-line learning based on hierarchical Bayesian adaptation of forgetting factors
- Author(s)
  J.Hirayama
- Journal Title
  
  Neurocomputing (to appear)
- Related Report
  2005 Annual Research Report
[Journal Article] Off-policy natural actor-critic method (in Japanese)
- Author(s)
  Mori, T.
- Journal Title
  
  Transactions of the IEICE (Japanese Edition) (to appear)
- Related Report
  2005 Annual Research Report
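[Book] Computational Mechanisms of the Brain: Bottom-up and Top-down Dynamics (in Japanese), contributed chapter (Chapter 3, pp. 18-38) (2005)
- Author(s)
  Sato, M., Ishii, S.
- Total Pages
  21
- Publisher
  Asakura Shoten
- Description
  From the Final Research Report Summary (in Japanese)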
- Related Report
  2005 Final Research Report Summary
[Book] Computational Mechanisms of the Brain: Bottom-up and Top-down Dynamics (in Japanese) (2005)
- Author(s)
  Sato, M.
- Total Pages
  21
- Publisher
  Asakura Shoten
- Related Report
  2005 Annual Research Report
[Patent(Industrial Property Rights)] Lossy coding method and apparatus, coding scheme program, and storage medium (2004)
- Inventor(s)
  Maeda, S., Ishii, S.
- Industrial Property Rights Holder
  Japan Science and Technology Agency
- Industrial Property Number
  2004-004200
- Filing Date
  2004-01-09
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary
[Patent(Industrial Property Rights)] Control device and program (2004)
- Inventor(s)
  Ishii, S., Nakamura, Y., Aso, K.
- Industrial Property Rights Holder
  Nara Institute of Science and Technology, Toyota Motor Corporation
- Industrial Property Number
  2004-267307
- Filing Date
  2004-09-14
- Description
  From the Final Research Report Summary (in Japanese)
- Related Report
  2005 Final Research Report Summary 2004 Annual Research Report
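[Patent(Industrial Property Rights)] Adaptive controller, adaptive control method, and adaptive control program (2003)
- Inventor(s)
  Yoshimoto, J., Ishii, S.
- Industrial Property Rights Holder
  Japan Science and Technology Corporation, Nara Institute of Science and Technology
- Industrial Property Number
  2003-314621
- Filing Date
  2003-09-05
- Description
  From the Final Research Report Summary (in Japanese)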
- Related Report
  2005 Final Research Report Summary
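[Publications] Yoshimoto, J.: "On-line EM reinforcement learning for automatic control of continuous dynamical systems" (in Japanese) Transactions of the Institute of Systems, Control and Information Engineers. 16(5). 209-217 (2003)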
- Related Report
  2003 Annual Research Report
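[Publications] Yoshimoto, J.: "Normalized Gaussian network based on variational Bayesian inference and hierarchical model selection" (in Japanese) Transactions of the Society of Instrument and Control Engineers. 39(5). 503-512 (2003)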
- Related Report
  2003 Annual Research Report
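[Publications] Nakamura, Y.: "Acquisition of walking movement by a reinforcement learning method using a neural oscillator network" (in Japanese) Transactions of the IEICE (Japanese Edition). J87-D-II(3). 893-902 (2004)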
- Related Report
  2003 Annual Research Report
[Publications] Amemori, K.: "Self-organization of delay lines by spike-time-dependent learning" Neurocomputing. (to appear).
- Related Report
  2003 Annual Research Report
[Publications] Ishii, S.: "Extensions to control theory and reinforcement learning" (in Japanese) Mathematical Sciences. 489. 38-45 (2004)
- Related Report
  2003 Annual Research Report
[Publications] Yoshida, W.: "A model-based reinforcement learning : a computational model and an fMRI study" 11th European Symposium on Artificial Neural Networks. 313-318 (2003)
- Related Report
  2003 Annual Research Report
[Publications] Yoshimoto, J.: "System identification based on on-line variational Bayes method and its application to reinforcement learning" Artificial Neural Networks and Neural Information Processing. LNCS 2714. 123-131 (2003)
- Related Report
  2003 Annual Research Report
[Publications] Fujita, H.: "A reinforcement learning scheme for a multi-agent card game" IEEE International Conference on Systems, Man & Cybernetics. 4071-4078 (2003)
- Related Report
  2003 Annual Research Report
[Publications] Amemori, K.: "Neuronal representations in the primate dorsolateral prefrontal cortex during memory-guided sensorimotor transformation process" Neuroscience Research. 46(Supplement 1). S195 (2003)
- Related Report
  2003 Annual Research Report
[Publications] Kanemoto, K.: "A probabilistic approach to identify the environmental models of mobile robots" Proceedings of the Ninth International Symposium on Artificial Life and Robotics. 1. 329-332 (2004)
- Related Report
  2003 Annual Research Report
[Publications] Nishimura, M.: "Acrobot control by learning the switching of multiple controllers" Proceedings of the Ninth International Symposium on Artificial Life and Robotics. 2. 633-636 (2004)
- Related Report
  2003 Annual Research Report