Reinforcement Learning for Environment with Dynamic Reward using Prior Knowledge

Research Project

Project/Area Number	24760308
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	System engineering
Research Institution	University of Tsukuba
Principal Investigator	Takeshi Shibuya University of Tsukuba, Faculty of Engineering, Information and Systems, Assistant Professor (90582776)
Project Period (FY)	2012-04-01 – 2016-03-31
Project Status	Completed (Fiscal Year 2015)
Budget Amount	¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
  Fiscal Year 2014: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
  Fiscal Year 2013: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
  Fiscal Year 2012: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Keywords	Reinforcement learning / Machine learning
Outline of Final Research Achievements	The purpose of this study is to develop an efficient reinforcement learning method that uses prior knowledge for dynamic environments. Because conventional reinforcement learning methods assume that the environment is static, learning in a dynamic environment is difficult. This study focuses on prior knowledge to overcome this difficulty and proposes learning methods for environments that have periodicity with respect to time or direction, and for environments whose transition probabilities vary with time.

Report

(5 results)

2015 Annual Research Report
Final Research Report (PDF)
2014 Research-status Report
2013 Research-status Report
2012 Research-status Report

Research Products
(17 results)

Journal Article (7 results; of which Peer Reviewed: 6 results, Acknowledgement Compliant: 2 results), Presentation (10 results)

[Journal Article] Profit Sharing reducing the occurrences of accidents by predicted action-safety degree (2015)
- Author(s)
  Junki Tamaru and Takeshi Shibuya
- Journal Title
  Proceedings of the 10th Asian Control Conference 2015 (ASCC 2015)
  Volume: ASCC2015
- Related Report
  2015 Annual Research Report
- Peer Reviewed
[Journal Article] A study of efficient reinforcement learning using the relative angle of two objects (2015)
- Author(s)
  Moriaki Onishi and Takeshi Shibuya
- Journal Title
  Proceedings on the 16th International Symposium on Advanced Intelligent Systems
  Volume: ISIS2015 Pages: 1091-1098
- Related Report
  2015 Annual Research Report
[Journal Article] Profit Sharing reducing the occurrences of accidents by predicted action-safety degree (2015)
- Author(s)
  Junki Tamaru and Takeshi Shibuya
- Journal Title
  Proceedings of the 10th Asian Control Conference 2015
  Volume: ASCC2015
- Related Report
  2014 Research-status Report
- Peer Reviewed
[Journal Article] Reinforcement learning for environments with periodically changing rewards [報酬が周期的に変化する環境のための強化学習] (2014)
- Author(s)
  澁谷長史, 安信誠二
- Journal Title
  電気学会論文誌C
  Volume: 134-9 Pages: 1325-1332
- NAID
  130004684941
- Related Report
  2014 Research-status Report
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Q-learning in continuous state-action space using a selective desensitization neural network [選択的不感化ニューラルネットを用いた連続状態行動空間におけるQ学習] (2014)
- Author(s)
  小林高彰, 澁谷長史, 森田昌彦
- Journal Title
  電子情報通信学会論文誌 D
  Volume: J98-D Pages: 287-299
- NAID
  110008746501
- Related Report
  2014 Research-status Report
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Q-learning in Continuous State-Action Space with Redundant Dimensions Using a Selective Desensitization Neural Network (2014)
- Author(s)
  T. Kobayashi, T. Shibuya and M. Morita
- Journal Title
  Proceedings of Joint 7th International Conference on Soft Computing and Intelligent Systems and 15th International Symposium on Advanced Intelligent Systems
  Volume: SCIS&ISIS2014 Pages: 801-806
- Related Report
  2014 Research-status Report
- Peer Reviewed
[Journal Article] Reinforcement learning using BAMDP-based prior knowledge for dynamic environment (2014)
- Author(s)
  T. Shibuya
- Journal Title
  USB Proceedings of the 11th International Conference on Modeling Decisions for Artificial Intelligence
  Volume: MDAI2014 Pages: 143-152
- Related Report
  2014 Research-status Report
- Peer Reviewed
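[Presentation] A study of a method for acquiring repetitive motions that incorporates external evaluation into the reward [外部による評価を報酬に組み入れる繰り返し動作の獲得手法の一検討] (2015)
- Author(s)
  NGUYEN VAN BAC, 澁谷長史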
- Organizer
  電気学会（システム研究会）
- Place of Presentation
  新潟県立看護大学（新潟県上越市）
- Year and Date
  2015-12-06
- Related Report
  2015 Annual Research Report
[Presentation] A study on dynamic updating of the phase change amount for dotQ-learning [dotQ-learningのための位相変化量の動的な更新に関する一検討] (2015)
- Author(s)
  作田宏行, 澁谷長史
- Organizer
  電気学会（システム研究会）
- Place of Presentation
  新潟県立看護大学（新潟県上越市）
- Year and Date
  2015-12-06
- Related Report
  2015 Annual Research Report
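[Presentation] A study of an efficient motion acquisition method using pre-training for robots with deformed frames [フレーム変形したロボットのための事前学習による効率的な動作獲得法の検討] (2015)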
- Author(s)
  羽鳥貴久, 澁谷長史
- Organizer
  第42回知能システムシンポジウム
- Place of Presentation
  北野プラザ六甲荘（兵庫）
- Year and Date
  2015-03-17 – 2015-03-18
- Related Report
  2014 Research-status Report
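[Presentation] A basic study of an efficient reinforcement learning method using the relative angle between an obstacle and the agent [障害物とエージェントの相対角を用いた効率的な強化学習法の基礎検討] (2015)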
- Author(s)
  大西杜諒, 澁谷長史
- Organizer
  電気学会研究会資料(システム研究会)
- Place of Presentation
  青山学院大学（神奈川）
- Year and Date
  2015-03-11
- Related Report
  2014 Research-status Report
[Presentation] Q-learning in Continuous State-Action Space with Redundant Dimensions Using a Selective Desensitization Neural Network (2014)
- Author(s)
  T. Kobayashi
- Organizer
  Joint 7th International Conference on Soft Computing and Intelligent Systems and 15th International Symposium on Advanced Intelligent Systems
- Place of Presentation
  北九州国際会議場（福岡）
- Year and Date
  2014-12-03 – 2014-12-06
- Related Report
  2014 Research-status Report
[Presentation] Reinforcement learning using BAMDP-based prior knowledge for dynamic environment (2014)
- Author(s)
  T. Shibuya
- Organizer
  the 11th International Conference on Modeling Decisions for Artificial Intelligence
- Place of Presentation
  筑波大学東京キャンパス（東京）
- Year and Date
  2014-10-29 – 2014-10-31
- Related Report
  2014 Research-status Report
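[Presentation] Reinforcement learning that adapts to environmental changes by estimating state transition probabilities reflecting prior knowledge [事前知識を反映した状態遷移確率推定により環境変化に適応する強化学習] (2014)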
- Author(s)
  臼井翼, 澁谷長史
- Organizer
  第41回知能システムシンポジウム
- Place of Presentation
  筑波大学東京キャンパス（東京）
- Related Report
  2013 Research-status Report
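[Presentation] Proposal of inverse reinforcement learning that estimates a time-dependent reward function from repeated state sequences [繰り返し状態系列から時刻依存の報酬関数を推定する逆強化学習の提案] (2013)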
- Author(s)
  田丸順基, 澁谷長史
- Organizer
  電気学会システム研究会
- Place of Presentation
  愛知県立大学サテライトキャンパス
- Related Report
  2013 Research-status Report
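[Presentation] A consideration of reinforcement learning in environments where the reward changes periodically at each step [ステップごとに報酬が周期的に変化する環境における強化学習の一考察] (2013)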
- Author(s)
  澁谷長史
- Organizer
  第４０回知能システムシンポジウム
- Place of Presentation
  京都工芸繊維大学（京都府）
- Related Report
  2012 Research-status Report
[Presentation] Reinforcement learning in environments where the rewarded region changes [報酬を与えられる領域が変化する環境における強化学習] (2012)
- Author(s)
  澁谷長史
- Organizer
  平成２４年度電気学会電子・情報・システム部門大会
- Place of Presentation
  弘前大学（青森県）
- Related Report
  2012 Research-status Report
