2023 Fiscal Year Annual Research Report
Human-like continual robot learning based on three-level computational energy cost regulation
Project/Area Number | 22H03670 |
Allocation Type | Single-year Grants |
Research Institution | Osaka University |
Principal Investigator | OZTOP Erhan, Osaka University, Institute for Open and Transdisciplinary Research Initiatives, Specially Appointed Professor (full-time) (90542217) |
Project Period (FY) | 2022-04-01 – 2025-03-31 |
Keywords | Lifelong Robot Learning / Learning Progress / Knowledge Transfer / Multitask Learning / Symbol Formation / Interleaved Learning |
Outline of Annual Research Achievements |
1) Different variations of a multi-task learning model with bidirectional skill transfer were explored. One-way skill transfer from the literature was generalized to bidirectional transfer, and we studied how human-like learning can be realized via 'interleaved learning' for effective lifelong robot learning (LRL). In current approaches, the task order is specified and tasks are learned to completion, whereas humans can switch tasks during learning and thereby leverage skill transfer. We realized such a mechanism in our LRL model and showed that it leads to effective learning.
2) A novel intrinsic motivation (IM) signal was proposed that combines computational energy cost (CEC) and learning progress (LP). Existing cognitive models disregard the cost of computation, yet the human brain must take it into account. The work on CEC-aware task selection and network loss definition was tested on a new set of robotic tasks. Simulations showed that a favorable trade-off between learning accuracy and energy consumption is achievable.
3) The LRL model was generalized to the reinforcement learning (RL) domain. To do so, a new LP signal, 'expected total reward progress', was proposed and shown to facilitate effective learning when used as a task-selection signal in an interleaved manner.
4) Additional work on symbolic representation with attention layers was conducted. The interplay between CEC and robotic trust was also explored with collaborators, and prediction uncertainty was investigated as a further IM signal that robots can use for lifelong learning.
|
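As a toy illustration of the CEC-aware interleaved task selection described above, the sketch below scores each task by its recent learning progress minus a weighted energy cost and always trains the highest-scoring task next, so the learner switches between tasks rather than training each to completion. This is a minimal sketch under invented assumptions: the task names, the geometric error-decay "learning" model, the fixed per-step energy costs, and the weighting `beta` are all hypothetical placeholders, not the project's actual model.

```python
class Task:
    """Toy task whose 'learning' is simulated by a geometrically
    decaying error; each training step has a fixed energy cost."""

    def __init__(self, name, decay, energy_cost):
        self.name = name
        self.decay = decay            # error reduction factor per training step
        self.energy_cost = energy_cost
        self.error = 1.0
        self.lp = 1.0                 # optimistic initial LP so every task gets tried

    def train_step(self):
        prev = self.error
        self.error *= self.decay      # simulated learning
        self.lp = prev - self.error   # learning progress = recent error reduction


def select_task(tasks, beta=0.5):
    """Intrinsic-motivation score: learning progress penalized by energy
    cost (beta weights the cost term). Picking the highest-scoring task
    each step yields interleaved training instead of training each task
    to completion."""
    return max(tasks, key=lambda t: t.lp - beta * t.energy_cost)


# Hypothetical task set: (name, decay rate, per-step energy cost).
tasks = [Task("reach", 0.90, 0.02),
         Task("push", 0.80, 0.05),
         Task("stack", 0.95, 0.01)]

for step in range(60):
    select_task(tasks).train_step()

for t in tasks:
    print(f"{t.name}: error={t.error:.4f}")
```

Because learning progress shrinks as a task is mastered, the scoring rule naturally shifts effort toward tasks that still yield progress per unit of energy, which is the qualitative behavior the CEC+LP signal is meant to capture.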
Current Status of Research Progress |
3: Progress in research has been slightly delayed.
Reason
The planned work items for the second year and their current status are as follows:
Integration of Neural Computational Cost (NCC): NCC was integrated into Lifelong Robot Learning (LRL) for 'task selection' and 'neural network loss computation'. It took time to find a good balance between learning progress and NCC, but overall the NCC work can be considered on time.
Integration of Symbol/Concept-based Knowledge Transfer: Work in this direction was conducted with collaborators and theoretical results were obtained; however, the integration of this knowledge into the LRL architecture is delayed.
More Complex Multi-task Learning Scenarios: We have switched to more complex task scenarios but remain in the action-effect prediction domain. The arbitrariness in the error definitions of the learning tasks makes it difficult to combine very different tasks in a single task-arbitration mechanism. This remains an open problem, and work on it is ongoing. Overall, the work toward task complexification can be considered on time.
Incorporation of Reinforcement Learning (RL) Tasks: This direction has been established and a paper has been submitted, so it is on time.
|
Strategy for Future Research Activity |
The third year of the project will focus on the following research items:
Integration of Symbol/Concept-based Knowledge Transfer: The existing know-how on symbol and concept formation will be integrated into the LRL model. Research will be conducted on how to represent knowledge in a resource-economical way and how to access that knowledge efficiently.
Real Robot Deployment: The perception and action primitives of the Torobo robot have been tuned for the tasks used in the simulations. However, due to the complexities faced during modeling and task simulation, the real-hardware experiments have been left to the last year of the project. Thus, the goal is to realize some of the simulated tasks on the real robot.
Heterogeneous Multi-task Learning Scenarios: The current task scenarios are homogeneous in that they are all based on action-effect prediction learning. A new approach is needed to address the arbitrariness in the error definitions of the tasks. Effort will be spent on developing a heterogeneous multi-task learning framework with effective solutions.
|