Project/Area Number |
15K00344
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Soft computing
|
Research Institution | Osaka Prefecture University |
Principal Investigator |
Notsu Akira, Osaka Prefecture University, Graduate School of Humanities and Sustainable System Sciences, Associate Professor (40405345)
|
Co-Investigator(Kenkyū-buntansha) |
Honda Katsuhiro, Osaka Prefecture University, Graduate School of Engineering, Professor (80332964)
|
Project Period (FY) |
2015-04-01 – 2018-03-31
|
Project Status |
Completed (Fiscal Year 2017)
|
Budget Amount |
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2017: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2016: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Fiscal Year 2015: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
|
Keywords | reinforcement learning / optimization problem / asymptotically optimal strategy / self-organizing map / decision making / clustering / online / cognitive model |
Outline of Final Research Achievements |
In this project, we studied methods for stochastically optimal selection in reinforcement learning and optimization problems. When there are multiple choices, the decision must weigh how much past experience has been accumulated for each choice against how good a result it can be expected to yield. We devised several frameworks for introducing the optimal strategy and confirmed that this strategy is the same in reinforcement learning and in optimization problems. In particular, we fundamentally re-examined the reinforcement learning algorithm from the viewpoint of Bayesian estimation, and the resulting reconstruction showed that the conventional idea of separating learning from decision making is mistaken. We also obtained results on a method for estimating the state of the learner without adding computational load.
|