2009 Fiscal Year Annual Research Report

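Computational Theory of Inductive Reinforcement Learning: Bayesian Estimation for Environment Exploration and Inductive Reconstruction (帰納的強化学習の計算理論～環境の探索と帰納的再構成のベイズ推定)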

Research Project

Project/Area Number	20700126
Research Institution	The University of Tokyo
Principal Investigator	Takaki Makino (牧野貴樹), The University of Tokyo, Division of Project Coordination (総括プロジェクト機構), Project Assistant Professor (20418651)
Keywords	Reinforcement learning / Nonparametric Bayes / Hidden Markov model / Hierarchical clustering / Chinese restaurant process / Sampling methods
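Research Abstract	This fiscal year, we carried out two studies as steps toward extending reinforcement learning research with nonparametric Bayesian models. The first is a method for hierarchically clustering the states of a hidden Markov model (HMM). The study demonstrated the validity of the model on natural language data, and a similar model is considered to be one effective way to learn the complex environments handled in reinforcement learning more efficiently. The second is the development of a sampling method that properly handles the joint distribution of multiple draws from a hierarchical Chinese restaurant process, as arises in nonparametric Bayesian models of HMMs. Because this method allows proper model estimation even when more complex models are built, it is expected to be useful for realizing reinforcement learning with nonparametric Bayesian methods in the future.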
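
As a minimal illustration of what "multiple draws from a hierarchical Chinese restaurant process" means, the sketch below forward-samples from a two-level CRP in which several child restaurants share one parent. It is not the conditional simultaneous sampling method developed in this project; it only shows the generative process that such a method has to reason about. The function names `crp_draw` and `hierarchical_crp_sample`, and the parameters `alpha` and `gamma`, are assumptions chosen for this example.

```python
import random


def crp_draw(counts, alpha, rng):
    """Pick an index from a Chinese restaurant process.

    counts[k] is the number of customers already at entry k. Returning
    len(counts) means "open a new entry", which happens with probability
    alpha / (sum(counts) + alpha).
    """
    r = rng.random() * (sum(counts) + alpha)
    for k, n in enumerate(counts):
        if r < n:
            return k
        r -= n
    return len(counts)


def hierarchical_crp_sample(n_groups, n_draws, alpha, gamma, seed=0):
    """Draw dishes for several restaurants that share one parent CRP.

    Each restaurant (for example, the transition distribution of one HMM
    state) seats customers with its own CRP(alpha). Every newly opened
    table sends one customer to the shared parent CRP(gamma), which
    decides the dish served at that table.
    """
    rng = random.Random(seed)
    parent_counts = []  # number of tables (over all restaurants) per dish
    draws = []
    for _ in range(n_groups):
        table_counts, table_dish, dishes = [], [], []
        for _ in range(n_draws):
            t = crp_draw(table_counts, alpha, rng)
            if t == len(table_counts):  # new table: ask the parent for a dish
                d = crp_draw(parent_counts, gamma, rng)
                if d == len(parent_counts):
                    parent_counts.append(0)
                parent_counts[d] += 1
                table_counts.append(0)
                table_dish.append(d)
            table_counts[t] += 1
            dishes.append(table_dish[t])
        draws.append(dishes)
    return draws, parent_counts
```

Calling `hierarchical_crp_sample(3, 50, alpha=1.0, gamma=1.0)` returns one list of dish labels per restaurant plus the parent's table counts. Because all restaurants draw their dishes through the same parent, draws in different restaurants are dependent, which is why their joint distribution needs careful treatment during inference.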

Research Products
(6 results)

Journal Article (2 results, of which Peer Reviewed: 2 results) / Presentation (4 results)

[Journal Article] Proto-predictive representation of states with simple recurrent temporal-difference networks (2009)
- Author(s)
  Takaki Makino
- Journal Title
  Proceedings of the 26th Annual International Conference on Machine Learning 26
  Pages: 697-704
- Peer Reviewed
[Journal Article] Self-Organization of Communication [コミュニケーションの自己組織化] (2009)
- Author(s)
  Takaki Makino
- Journal Title
  Handbook of Self-Organization [自己組織化ハンドブック] (NTS Publishing)
  Pages: 438-443
- Peer Reviewed
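[Presentation] Nonparametric Bayesian Estimation of Hidden Markov Models and MCMC Methods [隠れマルコフモデルのノンパラメトリックベイズ推定とMCMC法] (2010)
- Author(s)
  Takaki Makino
- Organizer
  Workshop "Markov Chain Monte Carlo Methods and Related Topics" [研究会『マルコフ連鎖モンテカルロ法とその周辺』]
- Place of Presentation
  The Institute of Statistical Mathematics (Tachikawa)
- Year and Date
  2010-02-21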
[Presentation] Conditional simultaneous draws from hierarchical Chinese restaurant processes (2009)
- Author(s)
  Takaki Makino, Shunsuke Takei, Daichi Mochihashi, Issei Sato, Toshihisa Takagi
- Organizer
  Nonparametric Bayes Workshop at NIPS 2009 (NPBayes 2009)
- Place of Presentation
  Whistler, BC, Canada
- Year and Date
  2009-12-11
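[Presentation] A Fast Parse-Tree Sampling Method for Bayesian Probabilistic Context-Free Grammars [ベイズ確率文脈自由文法のための高速構文木サンプリング法] (2009)
- Author(s)
  Shunsuke Takei, Takaki Makino, Toshihisa Takagi
- Organizer
  Information-Based Induction Sciences (IBIS) 2009
- Place of Presentation
  Kyushu University (Fukuoka)
- Year and Date
  2009-10-19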
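[Presentation] Hierarchical-State Infinite Hidden Markov Model [階層状態無限隠れマルコフモデル] (2009)
- Author(s)
  Takaki Makino
- Organizer
  Information-Based Induction Sciences (IBIS) 2009
- Place of Presentation
  Kyushu University (Fukuoka)
- Year and Date
  2009-10-19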
