Research on Reinforcement Learning Based on Statistical Learning

Research Project

Project/Area Number	20700208
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Sensitivity informatics/Soft computing
Research Institution	Kyoto University
Principal Investigator	Takeshi Mori (森 健), Kyoto University, Graduate School of Informatics, Researcher (00457144)
Project Period (FY)	2008 – 2009
Project Status	Completed (Fiscal Year 2010)
Budget Amount	¥2,990,000 (Direct Cost: ¥2,300,000, Indirect Cost: ¥690,000); Fiscal Year 2010: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000); Fiscal Year 2009: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000); Fiscal Year 2008: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Keywords	Reinforcement learning / Statistical learning
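Research Abstract	Many reinforcement learning methods need to approximate a "value function", which represents the long-term desirability of taking a given action in a given state. The most widely used approach is linear function approximation, which represents the value function as the inner product of a parameter vector and a set of basis functions. The basis functions are usually obtained through trial and error by the designer. Methods that construct basis functions automatically exist, but they incur a very large computational cost. We have proposed an approximation method that sequentially reduces the approximation error of the value function, and this fiscal year we worked mainly on publishing it. The method requires no prior trial and error by the designer, and its computational cost is small. We published the basic algorithm as an international conference paper, and we also published a more robust version of the algorithm as an international conference paper. We extended the theoretical and experimental analysis of the algorithm and submitted the work to an academic journal, but it has not yet been accepted. We believe that clarifying the statistical properties of the algorithm as a whole will allow further publications. In addition, we applied the various statistical-learning-based reinforcement learning algorithms we have devised so far to a physical robot purchased with this grant and attempted learning on it. Specifically, we built a two-wheeled robot using LEGO Mindstorms and balanced it using a new reinforcement learning method. We believe that automatically tuning the balancing of a two-wheeled robot will help improve individual riders' comfort on bicycles and motorcycles, and may also lead to a lower accident rate.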
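
As an illustration of the linear function approximation described in the abstract, the following is a minimal sketch in which a value function is represented as the inner product of a parameter vector and hand-designed basis functions, and the parameters are learned with standard TD(0). It is not the method proposed in this project. The random-walk task, the two basis functions, and all names (`features`, `value`, `td0_random_walk`) and step sizes are assumptions chosen for illustration only.

```python
import random


def features(state, n_states):
    """Hand-designed basis functions phi(s): a bias term and the normalized position.

    This choice is an assumption for the example; in practice it is the
    part the designer would tune by trial and error.
    """
    return [1.0, state / (n_states - 1)]


def value(theta, phi):
    """Linear approximation V(s) = theta . phi(s)."""
    return sum(t * p for t, p in zip(theta, phi))


def td0_random_walk(n_states=7, episodes=5000, alpha=0.01, gamma=1.0, seed=0):
    """Evaluate a uniform random-walk policy with TD(0) and linear features.

    States 0 and n_states - 1 are terminal. Reaching the right end gives
    reward 1; every other transition gives reward 0.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(episodes):
        s = n_states // 2  # each episode starts in the middle state
        while 0 < s < n_states - 1:
            s_next = s + rng.choice((-1, 1))
            terminal = s_next in (0, n_states - 1)
            reward = 1.0 if s_next == n_states - 1 else 0.0
            phi = features(s, n_states)
            # The value of a terminal state is defined to be zero.
            v_next = 0.0 if terminal else value(theta, features(s_next, n_states))
            # TD error, then a gradient-style update of the parameters.
            delta = reward + gamma * v_next - value(theta, phi)
            theta = [t + alpha * delta * p for t, p in zip(theta, phi)]
            s = s_next
    return theta
```

Calling `theta = td0_random_walk()` and then `value(theta, features(s, 7))` for the non-terminal states `s = 1, ..., 5` gives the learned value estimates. In this toy task the true value of state `s` is `s / 6`, which these two basis functions can represent exactly; with a poorer choice of basis the approximation error would remain, which is the design burden the abstract refers to.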

Report

(2 results)

2009 Annual Research Report
2008 Annual Research Report

Research Products
(7 results)

[Presentation] Robust approximation in decomposed reinforcement learning (2009)
- Author(s)
  Takeshi Mori
- Organizer
  International Conference on Neural Information Processing
- Place of Presentation
  Bangkok, Thailand
- Year and Date
  2009-12-04
- Related Report
  2009 Annual Research Report
[Presentation] An additive reinforcement learning (2009)
- Author(s)
  Takeshi Mori
- Organizer
  International Conference on Artificial Neural Networks
- Place of Presentation
  Limassol, Cyprus
- Year and Date
  2009-09-14
- Related Report
  2009 Annual Research Report
[Presentation] A continuous internal-state controller for partially observable Markov decision processes (2008)
- Author(s)
  Yuki Taniguchi
- Organizer
  International Conference on Artificial Neural Networks
- Place of Presentation
  Prague, Czech Republic
- Year and Date
  2008-09-04
- Related Report
  2008 Annual Research Report
[Presentation] Self-organized reinforcement learning based on policy gradient in nonstationary environment (2008)
- Author(s)
  Yu Hiei
- Organizer
  International Conference on Artificial Neural Networks
- Place of Presentation
  Prague, Czech Republic
- Year and Date
  2008-09-03
- Related Report
  2008 Annual Research Report
[Presentation] A semiparametric statistical approach to model-free policy evaluation (2008)
- Author(s)
  Tsuyoshi Ueno
- Organizer
  International Conference on Machine Learning
- Place of Presentation
  Helsinki, Finland
- Year and Date
  2008-07-06
- Related Report
  2008 Annual Research Report
[Remarks]
- URL
  http://hawaii.sys.i.kyoto-u.ac.jp/~tak-mori/
- Related Report
  2009 Annual Research Report
[Remarks]
- URL
  http://hawaii.sys.i.kyoto-u.ac.jp/~tak-mori/
- Related Report
  2008 Annual Research Report
