2019 Fiscal Year Final Research Report
Developing a theory of deep reinforcement learning equipped with bounded rationality
Project/Area Number |
17H04696
|
Research Category |
Grant-in-Aid for Young Scientists (A)
|
Allocation Type | Single-year Grants |
Research Field |
Soft computing
|
Research Institution | Tokyo Denki University |
Principal Investigator |
|
Project Period (FY) |
2017-04-01 – 2020-03-31
|
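Keywords | bounded rationality / reinforcement learning / satisficing / social learning / weakly supervised learning / decision problem / hypothesis testing / trial and error |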
Outline of Final Research Achievements |
Real-world agents such as humans and animals learn and act in a (boundedly) rational way toward their respective goals. Their learning and acting are subject to severe constraints on perception, information processing, and actuation. In this project, we hypothesized that efficient learning and acting are enabled by exploiting "satisficing," a search and decision-making policy that was proposed as an alternative to optimization. We gave a new implementation of satisficing (RS), established it as a useful algorithm, and proved its efficiency. We applied RS to various reinforcement learning tasks, including the most basic bandit problems and general tabular and non-tabular MDPs, and showed its efficiency.
|
Free Research Field |
Cognitive science
|
Academic Significance and Societal Importance of the Research Achievements |
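We clarified an important part of the logic of the autonomous, trial-and-error learning that humans and animals engage in. In particular, we gave a mechanistic explanation of why humans and animals improve their performance efficiently through competition and "competitive imitation." We also proved this efficiency mathematically and demonstrated it in a variety of settings. In addition, from the standpoint of capitalism and markets, we discussed the efficiency of competition and competitive imitation together with the dangers that are inseparable from it.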
|