2023 Fiscal Year Research-status Report
Socially Aware Robot Navigation in Human-populated Scenario
Project/Area Number |
23KJ0580
|
Research Institution | The University of Tokyo |
Principal Investigator |
呉 家旭 The University of Tokyo, Graduate School of Engineering, Research Fellow (DC2)
|
Project Period (FY) |
2023-04-25 – 2025-03-31
|
Keywords | robot learning / reinforcement learning / autonomous navigation / human-robot interaction |
Outline of Annual Research Achievements |
This year, a reinforcement learning (RL) framework based on deep neural networks (DNNs) was established for socially aware navigation in human-populated environments, built around three key points. For state representation, two alternatives, attention networks and graph convolutional networks, were compared. For reward design, a new entropy-based reward was proposed that aims to maximize the route choices available to surrounding pedestrians. Results in both simulation and real-world experiments showed that the socially aware navigation policy induced by the proposed reward outperformed previous work. For risk sensitivity, an algorithm called the LDM-based risk detector was proposed to detect early the tendency of the RL navigation policy to fall into risky areas.
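The report does not give the formula for the entropy-based reward. Purely as an illustration of the idea, the sketch below scores a robot position by the Shannon entropy of the routes still open to one nearby pedestrian. The function names, the discretization into evenly spaced headings, the blocking rule, and all parameter values are assumptions made for this example and are not taken from the project.

```python
import math


def route_entropy(weights):
    """Shannon entropy (in nats) of the distribution given by non-negative weights."""
    total = sum(weights)
    if total <= 0.0:
        return 0.0  # no feasible route left: lowest possible score
    probs = [w / total for w in weights if w > 0.0]
    return -sum(p * math.log(p) for p in probs)


def pedestrian_choice_reward(ped_xy, robot_xy, n_routes=8, step=1.0, robot_radius=0.5):
    """Hypothetical reward: entropy of the routes still open to one pedestrian.

    Each candidate route is a straight step of length `step` along one of
    `n_routes` evenly spaced headings. A route counts as blocked when its
    end point lies within `robot_radius` of the robot (an assumed rule).
    Open routes are weighted uniformly.
    """
    weights = []
    for k in range(n_routes):
        angle = 2.0 * math.pi * k / n_routes
        end = (ped_xy[0] + step * math.cos(angle),
               ped_xy[1] + step * math.sin(angle))
        blocked = math.dist(end, robot_xy) < robot_radius
        weights.append(0.0 if blocked else 1.0)
    return route_entropy(weights)


# A robot far from the pedestrian leaves all 8 routes open.
far = pedestrian_choice_reward((0.0, 0.0), (10.0, 10.0))
# A robot standing one step ahead of the pedestrian blocks one route.
near = pedestrian_choice_reward((0.0, 0.0), (1.0, 0.0))
```

Under these assumptions, a robot that leaves more routes open to the pedestrian receives a higher reward, which is the qualitative behavior the paragraph above describes.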
|
Current Status of Research Progress |
1: Research has progressed more than it was originally planned.
Reason
Sim-to-real transfer was achieved in an indoor navigation test. Although the initial plan was to conduct real-world experiments in 2024, hardware design and system integration were completed earlier than planned, which made the test possible. The success of the indoor navigation test, and the experience gained from it, provide a strong foundation for larger-scale outdoor experiments, which are closer to the real-world applications of this research. Benchmarking of previous navigation methods has also progressed more smoothly than planned. This is important for evaluating the proposed method and for discussing various design alternatives. Although the initial plan was to train all the previous methods in 2024, over half of the benchmark targets were trained in 2023 thanks to an automated train-test pipeline.
|
Strategy for Future Research Activity |
To train a navigation policy that can adapt to given rules without tedious retraining, Successor Features (SF) and Maximum Diffusion (MaxDiff) will be introduced into the current reinforcement learning framework. MaxDiff will be used to explore different types of robot-crowd interaction, and SF will be used to consolidate those interaction types into a task-conditioned policy. Furthermore, the navigation policy trained with this new approach will be tested in larger and more complex scenarios than last year: on-campus navigation tests with start-goal distances over 100 meters and crowd sizes over 30 people will be conducted.
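The report does not say how SF will be integrated. As a rough sketch of why SF can avoid retraining, the example below uses the standard successor-feature formulation from the literature, in which the reward is assumed linear in features (r = phi . w), so Q_i(s, a) = psi_i(s, a) . w, and a new task weight vector w can be served by generalized policy improvement (GPI) over already learned policies. The two features (goal progress, pedestrian discomfort), the two policies, and every number are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gpi_action(successor_features, task_weights):
    """Pick an action by generalized policy improvement (GPI).

    successor_features[i][a] is the successor-feature vector psi_i(s, a) of
    previously learned policy i for action a in the current state s.
    With r = phi . w, the value of policy i is Q_i(s, a) = psi_i(s, a) . w,
    and GPI acts greedily on max_i Q_i(s, a).
    """
    n_actions = len(successor_features[0])
    best_q = [
        max(dot(psi_i[a], task_weights) for psi_i in successor_features)
        for a in range(n_actions)
    ]
    action = max(range(n_actions), key=best_q.__getitem__)
    return action, best_q


# Hypothetical features: [goal progress, pedestrian discomfort].
# Actions: 0 = go straight through the crowd, 1 = take a detour.
PSI = [
    [[1.0, 0.8], [0.6, 0.1]],  # policy learned with an efficiency-focused reward
    [[0.7, 0.5], [0.5, 0.0]],  # policy learned with a politeness-focused reward
]

# The same successor features serve two different rules without retraining.
efficient_action, _ = gpi_action(PSI, [1.0, -0.2])  # mild discomfort penalty
social_action, _ = gpi_action(PSI, [1.0, -2.0])     # strong discomfort penalty
```

With these made-up values, changing only the weight vector switches the chosen action from going straight to taking the detour, which is the kind of rule adaptation the strategy above aims for.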
|