FY2012 Final Research Report

Advanced Research on Exploitation-oriented Learning (XoL)

Research Project

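Project/Area Number	22500143
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	National Institution for Academic Degrees and University Evaluation
Principal Investigator	Kazuteru Miyazaki, National Institution for Academic Degrees and University Evaluation, Research Department, Associate Professor (20282866)
Project Period (FY)	2010 – 2012
Keywords	Exploitation-oriented learning / reinforcement learning / design guidelines for rewards and penalties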
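Outline of Research	As an extension of Exploitation-oriented Learning (XoL), a machine learning approach that strongly reinforces acquired experience, we completed a method that can handle multiple types of rewards and penalties, and we succeeded in presenting design guidelines for rewards and penalties, which are particularly important in applications. As concrete applications, we applied XoL to a real system that supports course classification, to waist trajectory learning for a biped robot, and to the Keepaway task, a game problem modeled on soccer. We believe these results allowed us to make a strong case for the advantages of XoL over conventional reinforcement learning methods.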

Research Products

(8 results)

All: Journal Article (3 results) (of which peer-reviewed: 2 results), Presentation (3 results), Book (1 result), Remarks (1 result)

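[Journal Article] Relay Tutorial: Recent Developments in Reinforcement Learning, Part 5: Goal-Directed Learning Based on Application-Oriented Trial and Error, Exploitation-oriented Learning; XoL (2012)
- Author(s)/Presenter(s)
  Kazuteru Miyazaki
- Journal Title
  Keisoku to Seigyo (Journal of the Society of Instrument and Control Engineers)
  Volume: Vol.52, No.5 Pages: 462-467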
[Journal Article] Introduction of Fixed Mode States into Online Reinforcement Learning with Penalty and Reward and Its Application to Waist Trajectory Generation of Biped Robot (2012)
- Author(s)/Presenter(s)
  Seiya Kuroda, Kazuteru Miyazaki and Hiroaki Kobayashi
- Journal Title
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  Volume: Vol.16, No.6 Pages: 758-768
- Peer Reviewed
[Journal Article] Proposal of the Continuous-Valued Penalty Avoiding Rational Policy Making Algorithm (2012)
- Author(s)/Presenter(s)
  Kazuteru Miyazaki
- Journal Title
  Journal of Advanced Computational Intelligence and Intelligent Informatics
  Volume: Vol.16, No.2 Pages: 183-190
- Peer Reviewed
[Presentation] Proposal of an Exploitation-oriented Learning Method on Multiple Rewards and Penalties Environments (2012)
- Author(s)/Presenter(s)
  Kazuteru Miyazaki
- Conference
  The 2nd International Conference on Applied and Theoretical Information Systems Research (2nd ATISR)
- Place of Presentation
  The Grand Hotel (圓山大飯店), Taiwan
- Date
  2012-12-29
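[Presentation] A Study on the Proposal of Exploitation-oriented Learning for Multiple Types of Rewards and Penalties and on Its Design Guidelines (2012)
- Author(s)/Presenter(s)
  Kazuteru Miyazaki
- Conference
  FY2012 IEEJ Electronics, Information and Systems Society Conference
- Place of Presentation
  Hirosaki University
- Date
  2012-09-07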
[Presentation] Proposal and Evaluation of the Active Course Classification Support System with Exploitation-oriented Learning (2011)
- Author(s)/Presenter(s)
  Kazuteru Miyazaki
- Conference
  The 9th European Workshop on Reinforcement Learning (EWRL-9)
- Place of Presentation
  Athens Royal Olympic Hotel, Greece
- Date
  2011-09-11
[Book] Exploitation-oriented Learning XoL: A New Approach to Machine Learning Based on Trial-and-Error Searches (Chapter 15), in Multi-Agent Applications with Evolutionary Computation and Biologically Inspired Technologies: Intelligent Techniques for Ubiquity and Optimization, Kambayashi, Y. (Ed.) (2010)
- Author(s)/Presenter(s)
  Kazuteru Miyazaki
- Total Pages
  267-293
- Publisher
  IGI Global
[Remarks]
- URL
  http://svrrd2.niad.ac.jp/faculty/teru/x of s. html
