A policy selection method based on the priming effect in the cognitive psychology for reinforcement learning agent
Project/Area Number | 16K12493 |
Research Category | Grant-in-Aid for Challenging Exploratory Research |
Allocation Type | Multi-year Fund |
Research Field | Intelligent informatics |
Research Institution | Tokyo Denki University |
Principal Investigator |
|
Co-Investigator(Kenkyū-buntansha) |
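WEN Wen (温 文), The University of Tokyo, Graduate School of Engineering (Faculty of Engineering), Research Fellow (50646601)
KONO Hitoshi (河野 仁), Tokyo Polytechnic University, Faculty of Engineering, Assistant Professor (70758367)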
|
Project Period (FY) | 2016-04-01 – 2018-03-31 |
Project Status | Completed (Fiscal Year 2017) |
Budget Amount |
¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2017: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2016: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
|
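Keywords | knowledge selection / spreading activation model / transfer learning / reinforcement learning / multi-robot transfer learning / cognitive psychology / intelligent system architecture / selection of learned knowledge |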
Outline of Final Research Achievements |
This research proposes a policy transfer method, based on the spreading activation model from cognitive psychology, that lets a reinforcement learning agent learn appropriately in unknown or dynamic environments. The agent stores policies learned in various environments and learns flexibly by partially reusing whichever policy suits the current environment. In the proposed method, the stored policies are linked by undirected edges to form a network. The agent updates the activation value of each policy according to the environment by repeating the processes of recall, activation, spreading, and attenuation, and it learns based on this network, which it uses for transfer learning. Simulation experiments comparing the proposed method with several existing methods were conducted to confirm its usefulness. The results show that, with the proposed method, the agent accomplishes the task by selecting the optimal policy from among its stored policies.
|
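The recall, activation, spreading, and attenuation cycle described in the outline can be illustrated with a minimal Python sketch. This is not the project's implementation: the class name `PolicyNetwork`, the parameters `decay`, `spread`, and `boost`, and the specific update rules are all assumptions chosen for illustration. In particular, the outline does not say how the recalled policy is determined from the environment, so the caller supplies it here.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyNetwork:
    """Undirected graph of stored policies (hypothetical illustration)."""
    decay: float = 0.9    # assumed attenuation factor applied every step
    spread: float = 0.5   # assumed fraction of activation passed to neighbours
    activation: dict = field(default_factory=dict)  # policy name -> activation value
    edges: dict = field(default_factory=dict)       # policy name -> set of neighbour names

    def add_policy(self, name):
        self.activation.setdefault(name, 0.0)
        self.edges.setdefault(name, set())

    def connect(self, a, b):
        # Undirected edge: store the link in both directions.
        self.add_policy(a)
        self.add_policy(b)
        self.edges[a].add(b)
        self.edges[b].add(a)

    def step(self, recalled, boost=1.0):
        # Recall + activation: raise the value of the policy recalled
        # for the current environment (chosen by the caller).
        self.activation[recalled] += boost
        # Spreading: each node passes a share of its pre-spread activation
        # to its neighbours; a snapshot keeps the update order-independent.
        snapshot = dict(self.activation)
        for node, neighbours in self.edges.items():
            if not neighbours:
                continue
            share = self.spread * snapshot[node] / len(neighbours)
            for neighbour in neighbours:
                self.activation[neighbour] += share
        # Attenuation: all activation values decay over time.
        for node in self.activation:
            self.activation[node] *= self.decay

    def select(self):
        # Choose the policy with the highest activation value for transfer.
        return max(self.activation, key=self.activation.get)
```

A caller would build the graph with `connect`, call `step` with the recalled policy each time the environment is observed, and use `select` to pick the policy to transfer. In this sketch the source node keeps its own activation when spreading; a variant that subtracts the spread amount from the source would be equally plausible.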
Report (3 results)
Research Products (6 results)