Progressive research on the exploitation-oriented learning XoL

Research Project

Project/Area Number	22500143
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	National Institution for Academic Degrees and University Evaluation
Principal Investigator	MIYAZAKI Kazuteru, National Institution for Academic Degrees and University Evaluation, Research and Development Department, Associate Professor (20282866)
Project Period (FY)	2010 – 2012
Project Status	Completed (Fiscal Year 2012)
Budget Amount	¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
	Fiscal Year 2012: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
	Fiscal Year 2011: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
	Fiscal Year 2010: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
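Keywords	Exploitation-oriented Learning / Reinforcement Learning / Design Guideline of Rewards and Penalties / Machine Learning / Intelligent Machines / Agent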
Research Abstract	This research completed an Exploitation-oriented Learning (XoL) method that can handle multiple rewards and penalties. Furthermore, a design guideline for rewards and penalties in the XoL method was proposed through illustrative examples: a course classification task, a waist-trajectory learning task for a tendon-driven biped robot, and a Keepaway task in a multi-agent environment. The project claims that XoL surpasses traditional reinforcement learning based on dynamic programming in applications to real-world problems.

Report

(4 results)

2012 Annual Research Report
2012 Final Research Report
2011 Annual Research Report
2010 Annual Research Report

Research Products
(38 results)

Journal Article (19 results) (of which Peer Reviewed: 11 results), Presentation (14 results), Book (2 results), Remarks (3 results)

[Journal Article] Proposal of an Exploitation-oriented Learning Method on Multiple Rewards and Penalties Environments and the Design Guideline (2013)
- Author(s)
  Kazuteru Miyazaki
- Journal Title
  
  Journal of Computers
  
  Volume: In press
- Related Report
  2012 Annual Research Report
- Peer Reviewed
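[Journal Article] Relay Tutorial "Recent Developments in Reinforcement Learning," Part 5: Application-Oriented "Goal-Directed Learning Based on Trial and Error," Exploitation-oriented Learning; XoL (2013)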
- Author(s)
  宮崎和光
- Journal Title
  
  Keisoku to Seigyo (Journal of the Society of Instrument and Control Engineers)
  
  Volume: Vol.52, No.5
- Related Report
  2012 Annual Research Report
[Journal Article] A Study on the Effectiveness of the Failure Probability Propagation Algorithm EFP in Multi-Agent Environments (2013)
- Author(s)
  村岡宏紀, 宮崎和光, 小林博明
- Journal Title
  
  Proceedings of the 40th Intelligent Systems Symposium
  
  Volume: None Pages: 319-324
- Related Report
  2012 Annual Research Report
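[Journal Article] Relay Tutorial: Recent Developments in Reinforcement Learning, "Part 5: Application-Oriented Goal-Directed Learning Based on Trial and Error," Exploitation-oriented Learning; XoL (2012)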
- Author(s)
  宮崎和光
- Journal Title
  
  Keisoku to Seigyo (Journal of the Society of Instrument and Control Engineers)
  
  Volume: Vol.52, No.5 Pages: 462-467
- Related Report
  2012 Final Research Report
[Journal Article] Introduction of Fixed Mode States into Online Reinforcement Learning with Penalty and Reward and Its Application to Waist Trajectory Generation of Biped Robot (2012)
- Author(s)
  Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: Vol.16, No.6 Pages: 758-768
- Related Report
  2012 Annual Research Report 2012 Final Research Report
- Peer Reviewed
[Journal Article] Proposal of the Continuous-Valued Penalty Avoiding Rational Policy Making Algorithm (2012)
- Author(s)
  Kazuteru Miyazaki
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: Vol.16, No.2 Pages: 183-190
- Related Report
  2012 Final Research Report
- Peer Reviewed
[Journal Article] Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning (2012)
- Author(s)
  Kazuteru Miyazaki and Masaaki Ida
- Journal Title
  
  Lecture Notes in Computer Science
  
  Volume: Vol.7188 Pages: 333-344
- DOI
  10.1007/978-3-642-29946-9_32
- ISBN
  9783642299452, 9783642299469
- Related Report
  2012 Annual Research Report
- Peer Reviewed
[Journal Article] Introduction of Fixed Mode States into Online Profit Sharing and Its Application to Waist Trajectory Generation of Biped Robot (2012)
- Author(s)
  Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi
- Journal Title
  
  Lecture Notes in Computer Science
  
  Volume: Vol.7188 Pages: 297-308
- DOI
  10.1007/978-3-642-29946-9_29
- ISBN
  9783642299452, 9783642299469
- Related Report
  2012 Annual Research Report
- Peer Reviewed
[Journal Article] Proposal of an Exploitation-oriented Learning Method on Multiple Rewards and Penalties Environments (2012)
- Author(s)
  Kazuteru Miyazaki
- Journal Title
  
  Proc. of the 2nd International Conference on Applied and Theoretical Information Systems Research (2nd ATISR)
  
  Volume: None
- Related Report
  2012 Annual Research Report
- Peer Reviewed
[Journal Article] Proposal of an Active Course Classification Support System with Exploitation-oriented Learning Extended by Positive and Negative Examples (2012)
- Author(s)
  Kazuteru Miyazaki and Masaaki Ida
- Journal Title
  
  Proc. of the 6th International Conference on Soft Computing and Intelligent Systems and the 13th International Symposium on Advanced Intelligent Systems (SCIS-ISIS 2012)
  
  Volume: None Pages: 1520-1527
- Related Report
  2012 Annual Research Report
- Peer Reviewed
[Journal Article] A Study on the Proposal of Exploitation-oriented Learning for Multiple Kinds of Rewards and Penalties and Its Design Guideline (2012)
- Author(s)
  宮崎和光
- Journal Title
  
  Proceedings of the 2012 Annual Conference of the Electronics, Information and Systems Society, IEEJ
  
  Volume: None Pages: 559-564
- Related Report
  2012 Annual Research Report
[Journal Article] Proposal of the Continuous-Valued Penalty Avoiding Rational Policy Making Algorithm (2012)
- Author(s)
  Miyazaki, K
- Journal Title
  
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  
  Volume: Vol.16, No.2 Pages: 183-190
- Related Report
  2011 Annual Research Report
- Peer Reviewed
[Journal Article] A Study on a Conscious Decision-Making Method in Multiple-Reward Environments (2012)
- Author(s)
  宮崎和光
- Journal Title
  
  Proceedings of the 39th Intelligent Systems Symposium
  
  Pages: 95-98
- Related Report
  2011 Annual Research Report
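[Journal Article] Proposal of a Course Classification Support System Considering Sets of Positive and Negative Examples and Its Integration with Exploitation-oriented Learning (2011)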
- Author(s)
  宮崎和光, 井田正明
- Journal Title
  
  Proceedings of the 21st Intelligent System Symposium
- NAID
  120005566631
- Related Report
  2011 Annual Research Report
[Journal Article] Proposal of a Course Classification Support System for Degree-Awarding Services Using Exploitation-oriented Learning (2011)
- Author(s)
  宮崎和光, 井田正明
- Journal Title
  
  Preprints of the 38th Intelligent Systems Symposium
  
  Pages: 123-128
- Related Report
  2010 Annual Research Report
[Journal Article] The Penalty Avoiding Rational Policy Making algorithm in Continuous Action Spaces (2010)
- Author(s)
  Miyazaki, K.
- Journal Title
  
  Proceedings of the 11th International Conference on Intelligent Data Engineering and Automated Learning
  
  Pages: 178-185
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Threshold Learning in the Improved Penalty Avoiding Rational Policy Making Algorithm (2010)
- Author(s)
  Miyazaki, K., Kobayashi, J., Kobayashi, H.
- Journal Title
  
  Proceedings of the SICE Annual Conference 2010
  
  Pages: 3240-3245
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Automatic Tuning of Judgement Parameter in Continuous State Exploitation-oriented Learning (2010)
- Author(s)
  Miyazaki, K.
- Journal Title
  
  Proceedings of the SICE Annual Conference 2010
  
  Pages: 3246-3249
- Related Report
  2010 Annual Research Report
- Peer Reviewed
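[Journal Article] Application of the Improved Penalty Avoiding Rational Policy Making Algorithm to Multi-Agent Continuous Tasks and Its Evaluation through Experiments with Soccer Robots (2010)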
- Author(s)
  伊藤昌樹, 宮崎和光, 小林博明
- Journal Title
  
  Proceedings of the 53rd Japan Joint Automatic Control Conference
  
  Pages: 4-4
- NAID
  130005025728
- Related Report
  2010 Annual Research Report
[Presentation] Proposal of an Exploitation-oriented Learning Method on Multiple Rewards and Penalties Environments (2012)
- Author(s)
  Kazuteru Miyazaki
- Organizer
  The 2nd International Conference on Applied and Theoretical Information Systems Research (2nd ATISR)
- Place of Presentation
  The Grand Hotel (圓山大飯店), Taiwan
- Year and Date
  2012-12-29
- Related Report
  2012 Final Research Report
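[Presentation] A Study on the Proposal of Exploitation-oriented Learning for Multiple Kinds of Rewards and Penalties and Its Design Guideline (2012)
- Author(s)
  宮崎和光
- Organizer
  2012 Annual Conference of the Electronics, Information and Systems Society, IEEJ
- Place of Presentation
  Hirosaki University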
- Year and Date
  2012-09-07
- Related Report
  2012 Final Research Report
[Presentation] Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning (2011)
- Author(s)
  Kazuteru Miyazaki
- Organizer
  The 9th European Workshop on Reinforcement Learning (EWRL-9)
- Place of Presentation
  Athens Royal Olympic Hotel, Greece
- Year and Date
  2011-09-11
- Related Report
  2012 Final Research Report
[Presentation] Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning (2011)
- Author(s)
  Miyazaki, K
- Organizer
  The 9th European Workshop on Reinforcement Learning (EWRL-9)
- Place of Presentation
  Athens Royal Olympic Hotel
- Year and Date
  2011-09-11
- Related Report
  2011 Annual Research Report
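[Presentation] Proposal of a Course Classification Support System Considering Sets of Positive and Negative Examples and Its Integration with Exploitation-oriented Learning (2011)
- Author(s)
  宮崎和光
- Organizer
  The 21st Intelligent System Symposium
- Place of Presentation
  Kobe University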
- Year and Date
  2011-09-01
- Related Report
  2011 Annual Research Report
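[Presentation] Proposal of a Course Classification Support System for Degree-Awarding Services Using Exploitation-oriented Learning (2011)
- Author(s)
  宮崎和光
- Organizer
  The 38th Intelligent Systems Symposium
- Place of Presentation
  IS38wiki lecture session (held online because of the earthquake disaster)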
- Related Report
  2010 Annual Research Report
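[Presentation] Application of the Improved Penalty Avoiding Rational Policy Making Algorithm to Multi-Agent Continuous Tasks and Its Evaluation through Experiments with Soccer Robots (2010)
- Author(s)
  伊藤昌樹
- Organizer
  The 53rd Japan Joint Automatic Control Conference
- Place of Presentation
  Kochijo Hall (高知城ホール)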
- Year and Date
  2010-11-04
- Related Report
  2010 Annual Research Report
[Presentation] The Penalty Avoiding Rational Policy Making algorithm in Continuous Action Spaces (2010)
- Author(s)
  Miyazaki, K.
- Organizer
  11th International Conference on Intelligent Data Engineering and Automated Learning
- Place of Presentation
  University of the West of Scotland
- Year and Date
  2010-09-01
- Related Report
  2010 Annual Research Report
[Presentation] Threshold Learning in the Improved Penalty Avoiding Rational Policy Making Algorithm (2010)
- Author(s)
  Miyazaki, K.
- Organizer
  SICE Annual Conference 2010
- Place of Presentation
  Grand Hotel, Taipei, Taiwan
- Year and Date
  2010-08-21
- Related Report
  2010 Annual Research Report
[Presentation] Automatic Tuning of Judgement Parameter in Continuous State Exploitation-oriented Learning (2010)
- Author(s)
  Miyazaki, K.
- Organizer
  SICE Annual Conference 2010
- Place of Presentation
  Grand Hotel, Taipei, Taiwan
- Year and Date
  2010-08-21
- Related Report
  2010 Annual Research Report
[Presentation] Proposal of an Exploitation-oriented Learning Method on Multiple Rewards and Penalties Environments
- Author(s)
  Kazuteru Miyazaki
- Organizer
  The 2nd International Conference on Applied and Theoretical Information Systems Research (2nd ATISR)
- Place of Presentation
  The Grand Hotel (圓山大飯店), Taipei
- Related Report
  2012 Annual Research Report
[Presentation] Proposal of an Active Course Classification Support System with Exploitation-oriented Learning Extended by Positive and Negative Examples
- Author(s)
  Kazuteru Miyazaki
- Organizer
  The 6th International Conference on Soft Computing and Intelligent Systems and the 13th International Symposium on Advanced Intelligent Systems (SCIS-ISIS 2012)
- Place of Presentation
  Kobe Convention Center
- Related Report
  2012 Annual Research Report
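[Presentation] A Study on the Effectiveness of the Failure Probability Propagation Algorithm EFP in Multi-Agent Environments
- Author(s)
  宮崎和光
- Organizer
  The 40th Intelligent Systems Symposium
- Place of Presentation
  Kyoto Institute of Technology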
- Related Report
  2012 Annual Research Report
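[Presentation] A Study on the Proposal of Exploitation-oriented Learning for Multiple Kinds of Rewards and Penalties and Its Design Guideline
- Author(s)
  宮崎和光
- Organizer
  2012 Annual Conference of the Electronics, Information and Systems Society, IEEJ
- Place of Presentation
  Hirosaki University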
- Related Report
  2012 Annual Research Report
[Book] Exploitation-oriented Learning XoL - A new approach to machine learning based on trial-and-error searches (Chapter 15), Multi-Agent Applications with Evolutionary Computational and Biologically Inspired Technologies: Intelligent Techniques for Ubiquity and Optimization, Kambayashi, Y. (Ed.) (2010)
- Author(s)
  Kazuteru Miyazaki
- Publisher
  IGI Global
- Related Report
  2012 Final Research Report
[Book] Exploitation-oriented Learning XoL - A new approach to machine learning based on trial-and-error searches (Chapter 15), Multi-Agent Applications with Evolutionary Computational and Biologically Inspired Technologies: Intelligent Techniques for Ubiquity and Optimization, Kambayashi, Y. (Ed.) (2010)
- Author(s)
  Miyazaki, K.
- Publisher
  IGI Global
- Related Report
  2010 Annual Research Report
[Remarks]
- Related Report
  2012 Final Research Report
[Remarks]
- URL
  http://svrrd2.niad.ac.jp/faculty/teru/xol_s.html
- Related Report
  2011 Annual Research Report
[Remarks]
- URL
  http://svrrd2.niad.ac.jp/faculty/teru/xol_s.html
- Related Report
  2010 Annual Research Report
