Budget Amount |
¥23,660,000 (Direct Cost: ¥18,200,000, Indirect Cost: ¥5,460,000)
Fiscal Year 2019: ¥6,630,000 (Direct Cost: ¥5,100,000, Indirect Cost: ¥1,530,000)
Fiscal Year 2018: ¥6,370,000 (Direct Cost: ¥4,900,000, Indirect Cost: ¥1,470,000)
Fiscal Year 2017: ¥10,660,000 (Direct Cost: ¥8,200,000, Indirect Cost: ¥2,460,000)
|
Outline of Final Research Achievements |
Real-world agents such as humans and other animals learn and act in a (boundedly) rational way toward their respective goals, under severe restrictions on perception, information processing, and actuation. In this project, we hypothesized that efficient learning and acting are made possible by a search and decision-making policy called "satisficing," which was proposed as an alternative to optimization. We developed a new implementation of satisficing (RS), established it as a useful algorithm, and proved its efficiency. We then applied RS to a range of reinforcement learning tasks, from the most basic bandit problems to general tabular and non-tabular MDPs, and showed its efficiency.
|