2012 Fiscal Year Final Research Report

Progressive research on the exploitation-oriented learning XoL

Research Project

Project/Area Number	22500143
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	National Institution for Academic Degrees and University Evaluation
Principal Investigator	MIYAZAKI Kazuteru National Institution for Academic Degrees and University Evaluation, Research and Development Department, Associate Professor (20282866)
Project Period (FY)	2010 – 2012
Keywords	Exploitation-oriented Learning / Reinforcement Learning / Design guideline for rewards and penalties
Research Abstract	This research completed an Exploitation-oriented Learning (XoL) method that can handle multiple types of rewards and penalties. Furthermore, a design guideline for rewards and penalties in the XoL method was proposed through three illustrative examples: a course classification task, a waist-trajectory learning task for a tendon-driven biped robot, and a Keepaway task in a multi-agent environment. The research claims that XoL surpasses traditional reinforcement learning based on dynamic programming when applied to real-world problems.

Research Products
(8 results)

All Journal Article (3 results) (of which Peer Reviewed: 2 results) Presentation (3 results) Book (1 result) Remarks (1 result)

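[Journal Article] Relay Commentary: Recent Developments in Reinforcement Learning, "Part 5: Application-oriented Goal-directed Learning Based on Trial and Error", Exploitation-oriented Learning; XoL (2012)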
- Author(s)
  Kazuteru Miyazaki
- Journal Title
  
  計測と制御 (Journal of the Society of Instrument and Control Engineers)
  
  Volume: Vol.52, No.5 Pages: 462-467
[Journal Article] Introduction of Fixed Mode States into Online Reinforcement Learning with Penalty and Reward and Its Application to Waist Trajectory Generation of Biped Robot (2012)
- Author(s)
  Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: Vol.16, No.6 Pages: 758-768
- Peer Reviewed
[Journal Article] Proposal of the Continuous-Valued Penalty Avoiding Rational Policy Making Algorithm (2012)
- Author(s)
  Kazuteru Miyazaki
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: Vol.16, No.2 Pages: 183-190
- Peer Reviewed
[Presentation] Proposal of an Exploitation-oriented Learning Method on Multiple Rewards and Penalties Environments (2012)
- Author(s)
  Kazuteru Miyazaki
- Organizer
  The 2nd International Conference on Applied and Theoretical Information Systems Research (2nd ATISR)
- Place of Presentation
  Grand Hotel (圓山大飯店), Taiwan
- Year and Date
  2012-12-29
[Presentation] A Study on the Proposal and Design Guidelines of Exploitation-oriented Learning for Multiple Types of Rewards and Penalties (2012)
- Author(s)
  Kazuteru Miyazaki
- Organizer
  2012 (Heisei 24) Annual Conference of the Electronics, Information and Systems Society, IEE of Japan
- Place of Presentation
  Hirosaki University
- Year and Date
  2012-09-07
[Presentation] Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning (2011)
- Author(s)
  Kazuteru Miyazaki
- Organizer
  The 9th European Workshop on Reinforcement Learning (EWRL-9)
- Place of Presentation
  Athens Royal Olympic Hotel, Greece
- Year and Date
  2011-09-11
[Book] Exploitation-oriented Learning XoL - A new approach to machine learning based on trial-and-error searches - (Chapter 15), Multi-Agent Applications with Evolutionary Computation and Biologically Inspired Technologies: Intelligent Techniques for Ubiquity and Optimization, Kambayashi, Y. (Ed.) (2010)
- Author(s)
  Kazuteru Miyazaki
- Total Pages
  267-293
- Publisher
  IGI Global
[Remarks]
- URL
  http://svrrd2.niad.ac.jp/faculty/teru/x of s. html
