2022 Fiscal Year Annual Research Report
Human-like continual robot learning based on three-level computational energy cost regulation
Project/Area Number | 22H03670
Allocation Type | Single-year Grants
Research Institution | Osaka University
Principal Investigator | OZTOP Erhan, Osaka University, Institute for Open and Transdisciplinary Research Initiatives, Specially Appointed Professor (full-time) (90542217)
Project Period (FY) | 2022-04-01 – 2025-03-31
Keywords | Lifelong Robot Learning / Learning Progress / Knowledge Transfer / Multitask Learning / Symbol Formation
Outline of Annual Research Achievements
In the first year of the project, an appropriate robot simulator environment was selected (PyBullet) and the software platform for the Lifelong Robot Learning (LRL) model was developed on it. A robotic arm with three tasks (T1, T2, T3) is considered for LRL. The robot action is modeled as hitting objects at different angles, and the LRL tasks are set as predicting the effects of these actions in three different environments: free space (T1), a wall with changing orientation (T2), and an L-shaped wall with changing orientation (T3). Task selection is based on Learning Progress (LP), whereas the 'neural cost' consideration is left for next year. A basic knowledge transfer architecture was developed among the neural networks of the tasks. The symbol formation component was also explored but not yet incorporated into the simulated LRL model. In parallel with the development of the LRL model, supporting work was conducted, several publications were produced, and a workshop at IROS 2022 was held together with collaborators. In one line of research, symbol formation through the use of discrete units in the latent layers of deep neural networks was studied [2]. In addition, work on robotic trust was conducted with collaborators, which uses 'neural computational cost' for forming trust in social partners [3]; the LRL model can therefore be extended to include trust formation, even though this was not directly part of the initial proposal. Finally, to support human-robot related tasks, some work was devoted to teaching robots how to correct errors based on human demonstrations.
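The LP-based task selection described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the class name, window size, and the "priority for under-sampled tasks" rule are all assumptions; LP is taken here as the recent decrease in a task's prediction error.

```python
from collections import deque

class LPTaskSelector:
    """Pick the next task by Learning Progress (LP): the drop in mean
    prediction error between an older and a more recent window.
    Illustrative sketch only; names and window size are assumptions."""

    def __init__(self, task_ids, window=5):
        # Keep the last 2*window errors per task (e.g., T1, T2, T3).
        self.window = window
        self.errors = {t: deque(maxlen=2 * window) for t in task_ids}

    def record_error(self, task, error):
        self.errors[task].append(error)

    def learning_progress(self, task):
        hist = list(self.errors[task])
        if len(hist) < 2 * self.window:
            return float("inf")  # assumed rule: under-sampled tasks get priority
        older = sum(hist[: self.window]) / self.window
        recent = sum(hist[self.window:]) / self.window
        return older - recent  # positive when error is decreasing

    def select(self):
        # Engage the task where learning currently progresses fastest.
        return max(self.errors, key=self.learning_progress)
```

For example, a task whose error is steadily falling (high LP) is selected over tasks whose errors have plateaued, which is what drives the intrinsically motivated switching among T1-T3.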
Current Status of Research Progress
2: Research has progressed on the whole more than it was originally planned.
Reason
The current work is focused on: (1) building the main software architecture to test the developed ideas; (2) developing new neural architectures for symbol formation and their usable deployment in Lifelong Robot Learning (LRL); (3) building neural architectures that enable knowledge transfer without a priori knowledge of task execution order; (4) developing methods for Intrinsic Motivation (IM) based on Learning Progress (LP) and Neural Cost of Computation (NCC); and (5) improving learning speed by using Active Learning. The last research focus was not explicitly given in the project proposal; since the robots need to learn by themselves, it is important to accelerate learning by choosing actions cleverly. In the first year's research, we noted two less explored areas in the literature to which this project can contribute. The first is the connection between Learning Progress (LP), Neural Cost of Computation (NCC), and Active Learning. The second is how to realize multi-task learning under an unknown partial-learning regime, e.g., a task sequence such as T1, T2, T1, T3, T2, T1, T3, … in a 3-task learning scenario. Overall, the project is progressing well in the scientific investigations of items (2)-(4). The infrastructure-building task (1) was delayed by the search for a fast simulator that is sufficiently open and reliable for multi-task learning; considerable time was spent making the simulator run over long time scales, with tasks switching without restarting the simulator or the task learning.
Strategy for Future Research Activity
The second year of the project will focus on the following research items:
Integration of Neural Computational Cost (NCC): NCC will be integrated into Lifelong Robot Learning (LRL) for (1) task selection and (2) neural network learning, by introducing explicit regularization that aims for low-cost computation/task learning. The current status is at a stage where NCC can be addressed early in the summer of 2023.
Integration of Symbol/Concept-based Knowledge Transfer: The existing know-how on symbol and concept formation will be integrated into the LRL model. In particular, research is needed on how to represent knowledge in a resource-economical way and on algorithms for efficient access to that knowledge. For example, when a new task is engaged, how should the robot's cognitive system know which piece of information is most likely to be useful in solving the task?
More Complex Multi-task Learning Scenarios: The range of the simulation will be extended to include tasks that differ far more from each other. In particular, tasks that require different loss functions must be considered for a general LRL architecture.
Incorporation of Reinforcement Learning (RL) Tasks: An important capability for a robot is learning based on rewards. In the first year, only supervised learning tasks were considered. In the second year, the model will be extended to cover RL tasks with knowledge transfer when the task sequence is unspecified and partial learning is allowed, in the spirit of developmental learning and the aims of the LRL architecture.
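One way the planned NCC regularization could look is sketched below, under the assumption that the L1 norm of the network weights serves as the computational-cost proxy (sparser weights mean fewer effective computations); the project's actual cost measure and the function name `ncc_regularized_loss` are not specified in the report and are illustrative only.

```python
def ncc_regularized_loss(pred, target, weights, ncc_lambda=1e-3):
    """Task loss plus an assumed 'neural computational cost' penalty.

    pred, target : sequences of floats (e.g., predicted vs. actual effects)
    weights      : list of layers, each a list of float weights
    ncc_lambda   : trade-off between task accuracy and computational cost
    """
    # Supervised prediction error (mean squared error).
    n = len(pred)
    task_loss = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    # Assumed cost proxy: L1 norm of all weights (sparsity -> lower cost).
    neural_cost = sum(abs(w) for layer in weights for w in layer)
    return task_loss + ncc_lambda * neural_cost
```

Minimizing this combined objective during task learning would bias the networks toward low-cost solutions, while the same `neural_cost` term could be reused as a task-selection signal alongside LP.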