Project/Area Number |
20700126
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Single-year Grants |
Research Field |
Intelligent informatics
|
Research Institution | The University of Tokyo |
Principal Investigator |
MAKINO Takaki The University of Tokyo, Institute of Industrial Science, Project Associate Professor (20418651)
|
Project Period (FY) |
2008 – 2010
|
Project Status |
Completed (Fiscal Year 2010)
|
Budget Amount |
¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
Fiscal Year 2010: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2009: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2008: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
|
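Keywords | Reinforcement learning / Restricted Collapsed Draws / Bayesian inference / Apprenticeship learning / Infinite hidden Markov model / Clustering / Chinese restaurant process / TD-Network / Nonparametric Bayes / Inverse reinforcement learning / Hidden Markov model / Hierarchical clustering / Sampling methods / Bayesian estimation / Partially observable Markov decision process / Predictive state representation / Elman network |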
Research Abstract |
This study focuses on reconstructing environment models in reinforcement learning using Bayesian inference techniques. In reinforcement learning, an agent learns a model of its environment by trial and error; given a suitable Bayesian environment model that represents uncertainty about the environment, optimal exploration can be achieved. To this end, we proposed new approaches that improve the TD-network, an environment description framework based on predictive state representations. In addition, we extended a nonparametric Bayesian model for hidden Markov models so that it can represent hierarchical clustering of hidden states. Moreover, we applied the framework of apprenticeship learning and proposed a method that constructs an environment model from others' actions based on Bayesian inference. These are the elements required for a Bayesian treatment of the process of exploring an environment and reconstructing its model.
|