Socially Aware Robot Navigation in Human-populated Scenario

Research Project

Project/Area Number	23KJ0580
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Multi-year Fund
Section	Domestic
Review Section	Basic Section 61050:Intelligent robotics-related
Research Institution	The University of Tokyo
Principal Investigator	呉家旭 (The University of Tokyo, Graduate School of Engineering, JSPS Research Fellow (DC2))
Project Period (FY)	2023-04-25 – 2025-03-31
Project Status	Granted (Fiscal Year 2023)
Budget Amount	¥1,800,000 (Direct Cost: ¥1,800,000); Fiscal Year 2024: ¥900,000 (Direct Cost: ¥900,000); Fiscal Year 2023: ¥900,000 (Direct Cost: ¥900,000)
Keywords	robot learning / reinforcement learning / autonomous navigation / human-robot interaction
Outline of Research at the Start	This research focuses on realizing socially aware navigation for mobile robots. A novel reinforcement learning framework is proposed that trains a mobile robot to navigate efficiently through crowds and encourages it to cooperate politely with pedestrians to avoid collisions.
Outline of Annual Research Achievements	This year, a reinforcement learning (RL) framework based on deep neural networks (DNNs) for socially aware navigation in human-populated environments was established, with three key points. For state representation, two alternatives, attention networks and graph convolutional networks, were compared. For reward design, a new entropy-based reward was proposed that aims to maximize the choice of routes available to surrounding pedestrians. Results in both simulation and real-world experiments demonstrated that the socially aware navigation policy induced by the proposed reward outperformed previous work. For sensitivity to risk, this research proposed an algorithm called the LDM-based risk detector, which detects early the tendency of the RL navigation policy to fall into risky areas.
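The idea behind an entropy-based reward can be illustrated with a small sketch. This is not the project's actual reward, which the record does not specify; it only assumes that "choice of routes" can be approximated by the Shannon entropy of a uniform distribution over the headings a pedestrian can still take without coming too close to the robot. All function names, the number of headings, the step length, and the safety distance are hypothetical.

```python
import math


def route_entropy(route_weights):
    """Shannon entropy (in nats) of a discrete distribution over candidate routes."""
    total = sum(route_weights)
    if total <= 0:
        return 0.0  # no feasible route left: the pedestrian has no choice
    entropy = 0.0
    for weight in route_weights:
        prob = weight / total
        if prob > 0:
            entropy -= prob * math.log(prob)
    return entropy


def feasible_route_weights(ped_xy, robot_xy, n_headings=8, step=1.0, safe_dist=0.8):
    """Weight 1.0 for each heading whose one-step endpoint stays clear of the robot.

    Assumption: a route is a straight step in one of n_headings directions,
    and it is blocked if its endpoint is closer than safe_dist to the robot.
    """
    weights = []
    for k in range(n_headings):
        angle = 2 * math.pi * k / n_headings
        x = ped_xy[0] + step * math.cos(angle)
        y = ped_xy[1] + step * math.sin(angle)
        clear = math.hypot(x - robot_xy[0], y - robot_xy[1]) >= safe_dist
        weights.append(1.0 if clear else 0.0)
    return weights


def entropy_reward(pedestrians, robot_xy, scale=1.0):
    """Mean route entropy over all pedestrians, scaled (hypothetical reward term)."""
    if not pedestrians:
        return 0.0
    entropies = [
        route_entropy(feasible_route_weights(ped, robot_xy)) for ped in pedestrians
    ]
    return scale * sum(entropies) / len(entropies)
```

Under these assumptions, a robot standing far from a pedestrian leaves all eight headings open and earns the maximum term, while a robot standing directly in the pedestrian's way blocks several headings and earns less, which pushes the policy toward positions that leave people more options.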
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: Sim-to-real transfer was achieved in an indoor navigation test. While the initial plan was to conduct real-world experiments in 2024, hardware design and system integration were completed earlier than planned, which made the test possible. The success of, and experience gained from, the indoor navigation test can serve as a strong foundation for larger-scale outdoor experiments, which are closer to the real-world applications of this research. Benchmarking of previous navigation methods has also progressed more smoothly than planned. This is important for evaluating the proposed method and for discussing the various design alternatives. Although the initial plan was to train all the previous methods in 2024, over half of the benchmark targets were trained in 2023 thanks to an automated train-test pipeline.
Strategy for Future Research Activity	To train a navigation policy that can adapt to given rules without tedious retraining, Successor Features (SF) and Maximum Diffusion (MaxDiff) will be introduced into the current reinforcement learning framework. MaxDiff will be used to explore different types of robot-crowd interaction, and SF will be used to summarize those interaction types into a task-conditioned policy. Furthermore, the performance of the navigation policy trained with this new approach will be tested in larger and more complex scenarios than last year's. Navigation tests will be conducted on campus with start-goal distances of over 100 meters and crowd sizes of over 30 people.
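How successor features could allow adaptation to new rules without retraining can be sketched as follows. In the standard successor-feature formulation, the action value factorizes as Q(s, a) = ψ(s, a) · w, where ψ is the expected discounted sum of features under a policy and w is a task weight vector, so a new rule only requires a new w. The sketch below is an illustration under that assumption, not the project's implementation; the two features (goal progress, pedestrian discomfort), the policy names, and all numbers are hypothetical.

```python
def q_value(psi, w):
    """Action value as the dot product of successor features and task weights."""
    return sum(p * wi for p, wi in zip(psi, w))


def gpi_action(psi_table, w):
    """Generalized policy improvement: pick the action with the highest value
    across all stored policies for the task described by weights w.

    psi_table maps policy name -> {action name -> successor feature vector}.
    """
    best_action, best_q = None, float("-inf")
    for policy_psi in psi_table.values():
        for action, psi in policy_psi.items():
            q = q_value(psi, w)
            if q > best_q:
                best_action, best_q = action, q
    return best_action, best_q


# Hypothetical successor features for one state:
# [expected goal progress, expected discomfort caused to pedestrians]
psi_table = {
    "assertive": {"pass_through": [5.0, 2.0], "detour": [3.0, 0.5]},
    "polite": {"pass_through": [4.0, 1.5], "detour": [3.5, 0.2]},
}

w_fast = [1.0, -0.1]    # rule set that barely penalizes discomfort
w_polite = [1.0, -2.0]  # rule set that penalizes discomfort heavily

action_fast, _ = gpi_action(psi_table, w_fast)
action_polite, _ = gpi_action(psi_table, w_polite)
```

With these made-up numbers, changing only the weight vector switches the selected behavior from passing through the crowd to taking a detour, with no change to the stored successor features.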