Project/Area Number | 23K24926 |
Project/Area Number (Other) | 22H03670 (2022-2023) |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Multi-year Fund (2024) Single-year Grants (2022-2023) |
Section | General |
Review Section | Basic Section 61050: Intelligent robotics-related |
Research Institution | Osaka University |
Principal Investigator | OZTOP Erhan, Osaka University, Institute for Open and Transdisciplinary Research Initiatives, Specially Appointed Professor (full-time) (90542217) |
Project Period (FY) | 2022-04-01 – 2025-03-31 |
Project Status | Granted (Fiscal Year 2024) |
Budget Amount |
¥14,950,000 (Direct Cost: ¥11,500,000, Indirect Cost: ¥3,450,000)
Fiscal Year 2024: ¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2023: ¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2022: ¥6,240,000 (Direct Cost: ¥4,800,000, Indirect Cost: ¥1,440,000)
|
Keywords | Computational Energy Cost / Continual Action Learning / Concept Formation / Intrinsic Motivation / Skill Transfer / Lifelong Robot Learning / Learning Progress / Knowledge Transfer / Multitask Learning / Symbol Formation / Interleaved Learning |
Outline of Research at the Start |
Biological organisms are subject to energy-cost constraints, which enable efficient and useful action learning and development throughout their lifetimes. In robots, this limitation can be taken to correspond to a computational energy cost (CEC) constraint; the theme of this research is how, as in humans, this constraint can work in favor of continual action learning and development in robots. To investigate this, we assume a three-layer structure as the robot's action learning and development mechanism: an energy-cost-minimizing neural network, CEC-based intrinsic motivation, and concept formation. Computational methods based on this constraint are implemented on the robot at each layer, generating human-like learning behavior.
|
Outline of Annual Research Achievements |
1) Different variations of a multi-task learning model with bidirectional skill transfer were explored. One-way skill transfer from the literature was generalized to bidirectional transfer, and we studied how human-like learning can be realized via 'interleaved learning' for effective lifelong robot learning (LRL). In current approaches, the task order is specified and each task is learned to completion; humans, in contrast, can switch tasks during learning and thereby leverage skill transfer. We realized such a mechanism in the LRL model and showed that it leads to effective learning. 2) A novel intrinsic motivation (IM) signal was proposed that combines computational energy cost (CEC) and learning progress (LP). Existing cognitive models disregard the cost of computation, yet the human brain must take it into account. The work on CEC-aware task selection and network loss definition was tested on a new set of robotic tasks; the simulations showed that a favorable trade-off between learning accuracy and energy consumption is possible. 3) The LRL model was generalized to the reinforcement learning (RL) domain. To do so, a new LP signal, 'expected total reward progress', was proposed and shown to facilitate effective learning when used as a task-selection signal in an interleaved manner. 4) Additional work on symbolic representation with attention layers was conducted. The interplay between CEC and robot trust was also explored with collaborators, and prediction uncertainty was investigated as a further IM signal that robots can use for lifelong learning.
|
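The interleaved, cost-aware task selection described in items 1)-2) can be illustrated with a minimal Python sketch. All names and numbers here (the `InterleavedLearner` class, the `beta` weighting, the exponential error-decay stand-in for training) are illustrative assumptions, not the project's actual implementation:

```python
import random

def combined_im(lp, cec, beta=0.5):
    """Hypothetical combined intrinsic-motivation score: reward learning
    progress (LP), penalize computational energy cost (CEC)."""
    return lp - beta * cec

class InterleavedLearner:
    """Toy interleaved multi-task learner: at every step the task with the
    highest combined IM score receives one training update (simulated here
    by an exponential error decay), so the learner can switch tasks mid-way
    instead of learning each task to completion."""

    def __init__(self, n_tasks, beta=0.5, seed=0):
        rng = random.Random(seed)
        self.errors = [1.0 + rng.random() for _ in range(n_tasks)]
        # hypothetical per-task computational energy cost of one update
        self.cec = [0.1 * (i + 1) for i in range(n_tasks)]
        self.lp = [1.0] * n_tasks      # optimistic initial LP estimates
        self.beta = beta
        self.counts = [0] * n_tasks

    def step(self):
        scores = [combined_im(lp, c, self.beta)
                  for lp, c in zip(self.lp, self.cec)]
        t = max(range(len(scores)), key=scores.__getitem__)
        before = self.errors[t]
        self.errors[t] *= 0.9          # one simulated training update
        self.lp[t] = before - self.errors[t]   # LP = error reduction
        self.counts[t] += 1
        return t

learner = InterleavedLearner(n_tasks=3)
schedule = [learner.step() for _ in range(200)]
```

Because the CEC penalty differs per task, the cheaper tasks end up trained to lower error while expensive tasks are only learned to a cost-justified accuracy, which is one way the accuracy/energy trade-off mentioned above can arise.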
Current Status of Research Progress |
3: Progress in research has been slightly delayed.
Reason
The planned work items for the second year and their current status are as follows. Integration of Neural Computational Cost (NCC): NCC has been integrated into Lifelong Robot Learning (LRL) for 'task selection' and 'neural network loss computation'. It took time to find a good balance between learning progress and NCC, but overall the NCC work can be considered on time. Integration of symbol/concept-based knowledge transfer: work in this direction was conducted with collaborators and theoretical results were obtained; however, the integration of this knowledge into the LRL architecture is delayed. More complex multi-task learning scenarios: we have switched to more complex task scenarios but are still in the action-effect prediction domain. The arbitrariness in the error definitions of the learning tasks makes it difficult to combine very different tasks in a single task-arbitration mechanism; this remains an open problem on which work is ongoing. Overall, the work toward task complexification can be considered on time. Incorporation of reinforcement learning (RL) tasks: this direction has been established and a paper has been submitted, so it is on time.
|
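The 'neural network loss computation' use of NCC mentioned above can be sketched as a prediction loss augmented with a computational-energy term. The weight-magnitude proxy for energy and every name below (`cec_aware_loss`, `lam`, the single-weight model) are illustrative assumptions, not the project's actual formulation:

```python
def cec_aware_loss(preds, targets, weights, lam=0.01):
    """Prediction loss (MSE) plus a computational-energy proxy term;
    here the proxy is the L1 magnitude of the weights."""
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    energy = sum(abs(w) for w in weights)
    return mse + lam * energy

def train_1d(xs, ys, lam, steps=500, lr=0.05):
    """Gradient descent on a single-weight linear model y = w * x,
    trading prediction accuracy against the energy term."""
    w = 0.0
    for _ in range(steps):
        # gradient of the MSE term
        g = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # subgradient of the L1 energy term
        g += lam * (1.0 if w > 0 else -1.0 if w < 0 else 0.0)
        w -= lr * g
    return w

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # underlying relation: y = 2x
w_plain = train_1d(xs, ys, lam=0.0)   # converges to w = 2 (exact fit)
w_cec = train_1d(xs, ys, lam=1.0)     # settles at a smaller |w|: some
                                      # accuracy is traded for lower energy
```

The tension the reason text describes (finding a good balance between learning progress and NCC) corresponds here to choosing `lam`: too large and accuracy collapses, too small and the energy term has no effect.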
Strategy for Future Research Activity |
The third year of the project will focus on the following research items. Integration of symbol/concept-based knowledge transfer: the existing know-how on symbol and concept formation will be integrated into the LRL model. Research will be conducted on how to represent knowledge in a resource-economical way and how to access that knowledge efficiently. Real-robot deployment: the perception and action primitives of the Torobo robot have been tuned for the tasks used in the simulations. However, due to the complexities faced during modeling and task simulations, the real-hardware experiments were left to the last year of the project; the goal is therefore to realize some of the simulated tasks on the real robot. Heterogeneous multi-task learning scenarios: the current task scenarios are homogeneous in that they are all based on action-effect prediction learning. A new approach is needed to address the arbitrariness in the error definitions of the tasks; effort will be spent on developing a heterogeneous multi-task learning framework with effective solutions.
|
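One conceivable way to address the arbitrariness in task error definitions mentioned above is to arbitrate on a scale-free learning-progress signal rather than raw errors. The sketch below is an illustrative assumption, not the project's method: fractional error reduction makes tasks with incommensurable error units comparable in a single arbitration mechanism.

```python
def relative_progress(prev_err, curr_err, eps=1e-8):
    """Scale-free learning progress: fractional error reduction.
    Dividing by the task's own error scale removes the units, so a
    pose-error task (meters) and a vision task (pixels) can feed the
    same task-selection mechanism."""
    return (prev_err - curr_err) / (prev_err + eps)

# Two tasks with error scales five orders of magnitude apart:
pose_lp = relative_progress(0.010, 0.009)     # errors in meters
pixel_lp = relative_progress(1000.0, 900.0)   # errors in pixels
# Both improved by 10%, so both yield the same progress score (~0.1),
# whereas a raw error-drop comparison would be dominated by the pixel task.
```

A limitation worth noting: relative progress is only one normalization choice, and tasks whose errors plateau at different floors may still need per-task calibration.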