2010 Fiscal Year Annual Research Report

Advanced Research on Exploitation-oriented Learning XoL

Research Project

Project/Area Number	22500143
Research Institution	National Institution for Academic Degrees and University Evaluation
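Principal Investigator	Miyazaki, Kazuteru (宮崎和光), National Institution for Academic Degrees and University Evaluation, Faculty of Assessment and Research for Degrees, Associate Professor (20282866)
Keywords	Reinforcement learning / Machine learning / Intelligent machines / Agents / Exploitation-oriented learning
Research Abstract	In fiscal year 2010, as originally planned, we proposed a method satisfying XoL that handles continuous inputs and outputs. The results were presented at an international conference (Miyazaki, K., The Penalty Avoiding Rational Policy Making algorithm in Continuous Action Spaces, 11th International Conference on Intelligent Data Engineering and Automated Learning, pp.178-185, 2010). In that work, the Penalty Avoiding Rational Policy Making algorithm (PARP) for continuous inputs, proposed in 2007 (Miyazaki, K. and Kobayashi, S., A Reinforcement Learning System for Penalty Avoiding in Continuous State Spaces, Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol.11, No.6, pp.668-676, 2007), was combined with an original action-selection method suited to continuous actions, which made it possible to generate diverse actions. The effectiveness of the proposed method was confirmed by applying it to the swing-up and stabilization problem of an inverted pendulum. This means that a basic XoL method has been established for the case in which there is at most one type of reward and at most one type of penalty. We also expect these results to contribute substantially to the research planned for fiscal year 2011 onward, including "handling multiple types of rewards and penalties," "exploring applications of XoL," and "establishing design guidelines for rewards and penalties."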

Research Products
(12 results)

All Journal Article (5 results) (of which Peer Reviewed: 3 results) Presentation (5 results) Book (1 results) Remarks (1 results)

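[Journal Article] Proposal of a Course Classification Support System for Degree-Awarding Services Using Exploitation-oriented Learning (2011)
- Author(s)
  Miyazaki, K. (宮崎和光), Ida, M. (井田正明)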
- Journal Title
  
  Proceedings of the 38th Intelligent Systems Symposium
  
  Pages: 123-128
[Journal Article] The Penalty Avoiding Rational Policy Making algorithm in Continuous Action Spaces (2010)
- Author(s)
  Miyazaki, K.
- Journal Title
  
  Proceedings of the 11th International Conference on Intelligent Data Engineering and Automated Learning
  
  Pages: 178-185
- Peer Reviewed
[Journal Article] Threshold Learning in the Improved Penalty Avoiding Rational Policy Making Algorithm (2010)
- Author(s)
  Miyazaki, K., Kobayashi, J., Kobayashi, H.
- Journal Title
  
  Proceedings of the SICE Annual Conference 2010
  
  Pages: 3240-3245
- Peer Reviewed
[Journal Article] Automatic Tuning of Judgement Parameter in Continuous State Exploitation-oriented Learning (2010)
- Author(s)
  Miyazaki, K.
- Journal Title
  
  Proceedings of the SICE Annual Conference 2010
  
  Pages: 3246-3249
- Peer Reviewed
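[Journal Article] Application of the Improved Penalty Avoiding Rational Policy Making Algorithm to Multi-Agent Continuous Tasks and Its Evaluation through Experiments Using Soccer Robots (2010)
- Author(s)
  Ito, M. (伊藤昌樹), Miyazaki, K. (宮崎和光), Kobayashi, H. (小林博明)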
- Journal Title
  
  Proceedings of the 53rd Japan Joint Automatic Control Conference
  
  Pages: 4
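[Presentation] Proposal of a Course Classification Support System for Degree-Awarding Services Using Exploitation-oriented Learning (2011)
- Author(s)
  Miyazaki, K. (宮崎和光)
- Organizer
  38th Intelligent Systems Symposium
- Place of Presentation
  IS38wiki lecture session (held on the Internet because of the great earthquake)
- Year and Date
  2011-03-23 to 2011-03-25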
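[Presentation] Application of the Improved Penalty Avoiding Rational Policy Making Algorithm to Multi-Agent Continuous Tasks and Its Evaluation through Experiments Using Soccer Robots (2010)
- Author(s)
  Ito, M. (伊藤昌樹)
- Organizer
  53rd Japan Joint Automatic Control Conference
- Place of Presentation
  Kochijo Hall (高知城ホール)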
- Year and Date
  2010-11-04
[Presentation] The Penalty Avoiding Rational Policy Making algorithm in Continuous Action Spaces (2010)
- Author(s)
  Miyazaki, K.
- Organizer
  11th International Conference on Intelligent Data Engineering and Automated Learning
- Place of Presentation
  University of the West of Scotland
- Year and Date
  2010-09-01
[Presentation] Threshold Learning in the Improved Penalty Avoiding Rational Policy Making Algorithm (2010)
- Author(s)
  Miyazaki, K.
- Organizer
  SICE Annual Conference 2010
- Place of Presentation
  Grand Hotel, Taipei, Taiwan
- Year and Date
  2010-08-21
[Presentation] Automatic Tuning of Judgement Parameter in Continuous State Exploitation-oriented Learning (2010)
- Author(s)
  Miyazaki, K.
- Organizer
  SICE Annual Conference 2010
- Place of Presentation
  Grand Hotel, Taipei, Taiwan
- Year and Date
  2010-08-21
[Book] Exploitation-oriented Learning XoL: A new approach to machine learning based on trial-and-error searches (Chapter 15), in Multi-Agent Applications with Evolutionary Computation and Biologically Inspired Technologies: Intelligent Techniques for Ubiquity and Optimization, Kambayashi, Y. (Ed.) (2010)
- Author(s)
  Miyazaki, K.
- Total Pages
  267-293
- Publisher
  IGI Global
[Remarks]
- URL
  http://svrrd2.niad.ac.jp/faculty/teru/xol_s.html
