Project/Area Number |
23KJ0580
|
Research Category |
Grant-in-Aid for JSPS Fellows
|
Allocation Type | Multi-year Fund |
Section | Domestic |
Review Section |
Basic Section 61050: Intelligent robotics-related
|
Research Institution | The University of Tokyo |
Principal Investigator |
呉 家旭 The University of Tokyo, Graduate School of Engineering, JSPS Research Fellow (DC2)
|
Project Period (FY) |
2023-04-25 – 2025-03-31
|
Project Status |
Granted (Fiscal Year 2023)
|
Budget Amount |
¥1,800,000 (Direct Cost: ¥1,800,000)
Fiscal Year 2024: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2023: ¥900,000 (Direct Cost: ¥900,000)
|
Keywords | robot learning / reinforcement learning / autonomous navigation / human-robot interaction |
Outline of Research at the Start |
This research focuses on realizing socially aware navigation for mobile robots. A novel reinforcement learning framework is proposed that trains a mobile robot to navigate efficiently through crowds and encourages it to cooperate politely with pedestrians to avoid collisions.
|
Outline of Annual Research Achievements |
This year, a reinforcement learning (RL) framework using deep neural networks (DNNs) for socially aware navigation in human-populated environments was established around three key points. For state representation, two alternatives, attention networks and graph convolutional networks, were compared. For reward design, a new entropy-based reward was proposed that aims to maximize the route choices available to surrounding pedestrians. Results from both simulations and real-world experiments demonstrated that the socially aware navigation policy induced by the proposed reward outperformed previous works. For risk sensitivity, this research proposed an algorithm called the LDM-based risk detector, which detects early the tendency of the RL navigation policy to fall into risky areas.
|
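As a rough illustration of what an entropy-based reward over pedestrian route choices might look like, the sketch below scores a robot position by the mean Shannon entropy of each pedestrian's remaining route options. Everything here is an assumption made for illustration and is not taken from the project: the discretization of routes into candidate headings, the uniform distribution over unblocked headings, and the function names `shannon_entropy`, `route_choice_entropy`, and `entropy_reward` are all hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def route_choice_entropy(feasible):
    """Entropy of a uniform distribution over one pedestrian's open routes.

    `feasible` holds one boolean per candidate heading (a hypothetical
    discretization); True means the robot does not block that heading.
    """
    n_open = sum(feasible)
    if n_open == 0:
        return 0.0  # no route left: treated as zero entropy in this sketch
    return shannon_entropy([1.0 / n_open] * n_open)


def entropy_reward(pedestrians_feasible):
    """Mean route-choice entropy over all surrounding pedestrians."""
    if not pedestrians_feasible:
        return 0.0
    total = sum(route_choice_entropy(f) for f in pedestrians_feasible)
    return total / len(pedestrians_feasible)


# One pedestrian with four candidate headings (assumed values).
reward_open = entropy_reward([[True, True, True, True]])
reward_blocked = entropy_reward([[True, True, False, False]])
```

Under these assumptions, a robot position that leaves all four headings open scores higher than one that blocks two of them, so maximizing the reward favors motion that preserves the options of nearby pedestrians.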
Current Status of Research Progress |
1: Research has progressed more than originally planned.
Reason
Sim-to-real transfer was achieved in an indoor navigation test. Although the initial plan was to conduct real-world experiments in 2024, hardware design and system integration were completed earlier than planned, which made the test possible. The success of and experience gained from the indoor navigation test could be a strong foundation for larger-scale outdoor experiments, which are closer to the real-world applications of this research. Benchmarking of previous navigation methods has also progressed more smoothly than planned. This is important for evaluating the proposed method and for discussing various design alternatives. Although the initial plan was to train all the previous methods in 2024, over half of the benchmark targets were trained in 2023 thanks to an automated train-test pipeline.
|
Strategy for Future Research Activity |
To train a navigation policy that can adapt to given rules without tedious retraining, Successor Features (SF) and Maximum Diffusion (MaxDiff) will be introduced into the current reinforcement learning framework. MaxDiff will be used to explore different types of robot-crowd interaction, and SF will be used to summarize those interaction types into a task-conditioned policy. Furthermore, the performance of the navigation policy trained with this approach will be tested in larger and more complex scenarios than last year. Navigation tests will be conducted on campus with start-goal distances over 100 meters and crowd sizes over 30 people.
|
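To make the idea of adapting to new rules without retraining more concrete, here is a minimal sketch of how successor features are commonly used: if the reward is linear in some features, the action value is the dot product of a policy's successor features with a task weight vector, so changing the weights changes behavior without further training. The sketch pairs this with generalized policy improvement (GPI), a standard companion technique that the project description does not mention. The feature meanings, the numeric values in `PSI`, and the names `q_value` and `gpi_action` are hypothetical.

```python
def q_value(psi, w):
    """Q(s, a) = psi(s, a) . w for a reward that is linear in the features."""
    return sum(p * wi for p, wi in zip(psi, w))


def gpi_action(psi_by_policy, w):
    """Pick the action with the highest Q under any stored policy (GPI).

    psi_by_policy[i][a] is the successor-feature vector of stored policy i
    for action a in the current state.
    """
    n_actions = len(psi_by_policy[0])
    best_q = [
        max(q_value(psi[a], w) for psi in psi_by_policy)
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=lambda a: best_q[a])


# Assumed features: [progress toward goal, closeness to pedestrians].
# Assumed actions: 0 = pass straight through the crowd, 1 = take a detour.
PSI = [
    [[1.0, 0.8], [0.6, 0.2]],  # stored policy A (hypothetical values)
    [[0.9, 0.6], [0.5, 0.1]],  # stored policy B (hypothetical values)
]

action_fast = gpi_action(PSI, [1.0, -0.1])    # task: weak closeness penalty
action_polite = gpi_action(PSI, [1.0, -2.0])  # task: strong closeness penalty
```

With these made-up numbers, the weak penalty selects the straight path and the strong penalty selects the detour, using the same stored successor features in both cases.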