A Study on Reinforcement Learning with Knowledge

Research Project

Project/Area Number	10680372
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Tokyo Institute of Technology
Principal Investigator	YAMAMURA Masayuki, Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Associate Professor (00220442)
Project Period (FY)	1998 – 1999
Project Status	Completed (Fiscal Year 1999)
Budget Amount	¥3,300,000 (Direct Cost: ¥3,300,000) Fiscal Year 1999: ¥1,200,000 (Direct Cost: ¥1,200,000) Fiscal Year 1998: ¥2,100,000 (Direct Cost: ¥2,100,000)
Keywords	reinforcement learning / Bayesian network / stochastic gradient method / Khepera robot simulator / lifelong learning / bidirectional AntNet / multiagent reinforcement learning / traffic signal control / task connection
Research Abstract	The results of this project fall into the following three groups. (1) Reinforcement learning on Bayesian networks: We derived a set of propagation rules for the stochastic gradient method, a kind of reinforcement learning method, from the belief propagation rules of Bayesian networks. We also applied it to robot navigation tasks on the Khepera robot simulator in order to incorporate a priori knowledge such as a map. (2) Lifelong reinforcement learning: We extended the framework of lifelong learning to reinforcement learning. Since a lifelong agent faces multiple tasks that share some invariant properties, previous experience should help it perform future tasks. We confirmed this effect on robot navigation tasks. We also derived some mathematical theorems for continuous worlds. (3) Multiagent reinforcement learning in an open world: We explored extending reinforcement learning to multiagent systems. We analyzed the dynamical behavior of a distributed adaptive controller for traffic signal systems. We also proposed bidirectional AntNet for adaptive routing in computer networks and achieved better performance than existing methods.

Report

(3 results)

1999 Annual Research Report Final Research Report Summary
1998 Annual Research Report

Research Products
(23 results)

All Publications (23 results)

[Publications] Masayuki Yamamura, Takashi Onozuka: "Reinforcement Learning with Knowledge by using a Stochastic Gradient Method on a Bayesian Network" Proceedings of International Joint Conference on Neural Networks 1998. 2045-2050 (1998)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] Fumihide Tanaka, Masayuki Yamamura: "Reinforcement Learning of Lifelong Agents (in Japanese)" Robotics and Mechatronics Conference '98 (ROBOMEC'98). (1998)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
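[Publications] Hiroshi Miyashita, Masayuki Yamamura: "An Analysis on Connecting Learned Policies in Multitask Reinforcement Learning (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 121-126 (1999)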
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
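[Publications] Takashi Onozuka, Masayuki Yamamura: "An Application of Reinforcement Learning on Bayesian Network for Kepera Robot Simulators (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 127-132 (1999)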
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
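[Publications] Isao Yoshida, Masayuki Yamamura: "A Study of Adaptive Signal Control on Traffic systems (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 157-162 (1999)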
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] Shigeo Doi, Masayuki Yamamura: "Adaptive routing by BntNet (in Japanese)" Proceedings of the SICE Systems and Informatics Department Symposium 1999. 215-220 (1999)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] Masayuki Yamamura, Takashi Onozuka: "Reinforcement Learning with Knowledge by using a Stochastic Gradient Method on a Bayesian Network" Proceedings of International Joint Conference on Neural Networks 1998 (IJCNN98). 2045-2050 (1998)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Fumihide Tanaka, Masayuki Yamamura: "Reinforcement Learning of Lifelong Agents (in Japanese)" Proceedings of Robotics and Mechatronics Conference 98 (ROBOMEC98). (1998)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Hiroshi Miyashita, Masayuki Yamamura: "An Analysis on Connecting Learned Policies in Multitask Reinforcement Learning (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 121-126 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Takashi Onozuka, Masayuki Yamamura: "An Application of Reinforcement Learning on Bayesian Network for Kepera Robot Simulators (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 127-132 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Isao Yoshida, Masayuki Yamamura: "A Study of Adaptive Signal Control on Traffic systems (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 157-162 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Shigeo Doi, Masayuki Yamamura: "Adaptive routing by BntNet (in Japanese)" Proceedings of the SICE Systems and Informatics Department Symposium 1999. 215-220 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Masayuki Yamamura, Takashi Onozuka: "Reinforcement Learning with Knowledge by using a Stochastic Gradient Method on a Bayesian Network" Proceedings of International Joint Conference on Neural Networks 1998. 2045-2050 (1998)
- Related Report
  1999 Annual Research Report
[Publications] Fumihide Tanaka, Masayuki Yamamura: "Reinforcement Learning of Lifelong Agents (in Japanese)" Robotics and Mechatronics Conference '98 (ROBOMEC'98). (1998)
- Related Report
  1999 Annual Research Report
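[Publications] Hiroshi Miyashita, Masayuki Yamamura: "An Analysis on Connecting Learned Policies in Multitask Reinforcement Learning (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 121-126 (1999)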
- Related Report
  1999 Annual Research Report
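[Publications] Takashi Onozuka, Masayuki Yamamura: "An Application of Reinforcement Learning on Bayesian Network for Kepera Robot Simulators (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 127-132 (1999)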
- Related Report
  1999 Annual Research Report
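[Publications] Isao Yoshida, Masayuki Yamamura: "A Study of Adaptive Signal Control on Traffic systems (in Japanese)" Proceedings of the 26th SICE Intelligent Systems Symposium. 157-162 (1999)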
- Related Report
  1999 Annual Research Report
[Publications] Shigeo Doi, Masayuki Yamamura: "Adaptive routing by BntNet (in Japanese)" Proceedings of the SICE Systems and Informatics Department Symposium 1999. 215-220 (1999)
- Related Report
  1999 Annual Research Report
[Publications] Yamamura, M., Onozuka, T.: "Reinforcement Learning with Knowledge by using a Stochastic Gradient Method on a Bayesian Network" Proc. of International Joint Conference on Neural Networks. 2045-2050 (1998)
- Related Report
  1998 Annual Research Report
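[Publications] Takashi Onozuka, Masayuki Yamamura: "An Application of Reinforcement Learning on Bayesian Network for Kepera Robot Simulators (in Japanese)" Preprints of the 26th SICE Intelligent Systems Symposium. (in press). (1999)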
- Related Report
  1998 Annual Research Report
[Publications] Fumihide Tanaka, Masayuki Yamamura: "Reinforcement Learning of Lifelong Agents (in Japanese)" Preprints of Robotics and Mechatronics Conference '98 (ROBOMEC98). (CD-ROM). (1998)
- Related Report
  1998 Annual Research Report
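[Publications] Hiroshi Miyashita, Masayuki Yamamura: "A Method for Connecting Learned Tasks in Reinforcement Learning (in Japanese)" Preprints of the 26th SICE Intelligent Systems Symposium. (in press). (1999)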
- Related Report
  1998 Annual Research Report
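[Publications] Isao Yoshida, Masayuki Yamamura: "A Study of Adaptive Signal Control on Traffic systems (in Japanese)" Preprints of the 26th SICE Intelligent Systems Symposium. (in press). (1999)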
- Related Report
  1998 Annual Research Report
