DEVELOPMENT AND EVALUATION OF AGENTS THAT CAN ADAPTIVELY LEARN COOPERATIVE TACTICS
Project/Area Number | 12680369 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | SAITAMA INSTITUTE OF TECHNOLOGY (2001); The University of Tokyo (2000) |
Principal Investigator | NAGANO Saburo, SAITAMA INSTITUTE OF TECHNOLOGY, Department of Informational Society Studies / Advanced Science Research Laboratory, Professor (50010913) |
Co-Investigator (Kenkyū-buntansha) | UEDA Kazuhiro, THE UNIVERSITY OF TOKYO, Graduate School, Interfaculty Initiative in Information Studies / Interdisciplinary Information Studies, Associate Professor (60262101) |
Project Period (FY) | 2000 – 2001 |
Project Status | Completed (Fiscal Year 2001) |
Budget Amount | ¥3,700,000 (Direct Cost: ¥3,700,000) |
  Fiscal Year 2001: ¥1,200,000 (Direct Cost: ¥1,200,000)
  Fiscal Year 2000: ¥2,500,000 (Direct Cost: ¥2,500,000)
Keywords | Multi-Agent / Machine Learning / Concurrent Learning / Soccer Agents / Reinforcement Learning / RoboCup / Probabilistic Estimation / Adaptive Learning / Cooperation / Tactics / Bayesian Estimation |
Research Abstract |
This research proposes soccer agents that can learn cooperative tactics. Drawing on the way actual soccer coaches help novice players learn soccer, we proposed a mechanism for learning to distinguish good tactics from bad ones. The agents adaptively learn the utilities of situations and the conditional probabilities over situations, and use them to predict the opponents' behavior and to select good actions. In a discrete 3-by-4 grid field with 3 attackers and 2 defenders, the agents were observed to cooperatively create pass combinations, such as wall passes and one-two passes; this is considered to be possible only through mutual behavioral prediction.
In addition, to evaluate the learning method of our agents, we built agents that tackled the same task with the Q-learning algorithm. With Q-learning, the variance of the attacking team's winning percentage was quite large, which means that the Q-learning algorithm is not appropriate for learning soccer agents, where the concurrent learning problem is crucial.
By contrast, the learning curves of our agents were stable, which means that our method is robust to the side effects of concurrent learning. We also provided the mechanisms for our agents to participate in a full game of the RoboCup simulation league. We built agents that have a subjective grid view, in order to adapt to more global situations, and that make decisions in the same way as the agents described above. These agents were implemented with state variables extracted on the basis of concentric circular grids attached relatively around each agent. The perception of the field situation in each grid cell was reduced to information about the difference between the numbers of allies and opponents and about the spatial accessibility. These agents showed better performance than non-learning agents, such as those of YowAI and CMUnited, in team play in an 11-vs-11 game.
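The subjective grid view can likewise be sketched. The hypothetical feature extractor below keeps only the ally-minus-opponent count per concentric ring around the agent; the actual ring radii, any angular subdivision of the grids, and the spatial-accessibility term are not given in the record and are simplified away here.

```python
import math

def ring_features(agent_pos, allies, opponents, radii=(3.0, 8.0, 15.0)):
    """Hypothetical sketch: for each concentric ring centred on the agent,
    return (number of allies) - (number of opponents). Radii are illustrative."""
    def ring_index(p):
        d = math.dist(agent_pos, p)
        for i, r in enumerate(radii):
            if d <= r:
                return i
        return None  # outside the outermost ring

    diff = [0] * len(radii)
    for p in allies:
        i = ring_index(p)
        if i is not None:
            diff[i] += 1
    for p in opponents:
        i = ring_index(p)
        if i is not None:
            diff[i] -= 1
    return diff
```

Reducing each cell to a count difference keeps the state space small enough for the same tabular learner to be reused on the full-sized field, which is consistent with the abstract's claim that these agents decide in the same way as the grid-field agents.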
|