Budget Amount |
¥3,400,000 (Direct Cost: ¥3,400,000)
Fiscal Year 2002: ¥800,000 (Direct Cost: ¥800,000)
Fiscal Year 2001: ¥2,600,000 (Direct Cost: ¥2,600,000)
|
Research Abstract |
Previously, we proposed two learning algorithms, Labeling Q-learning (LQ-learning) and Switching Q-learning (SQ-learning). The former has a simple structure consisting of a single agent, yet it learns well in certain kinds of POMDP environments. The latter is a type of hierarchical Q-learning (HQ-learning) method that switches among Q-modules by means of a hierarchical learning automaton, and it also works well in more complicated POMDP environments. In this study, we improved these two algorithms and developed more effective HQ-learning algorithms. Further, to cope with more realistic environments in which observations, actions, or both take continuous values, we conducted a basic study of function approximation by neural networks. The results are as follows. 1) We improved SQ-learning so that it works well in noisy environments, and demonstrated that it performs better than Wiering's HQ-learning. 2) We enhanced the performance of LQ-learning by introducing Kohonen's self-organizing map (SOM). 3) We improved the self-segmentation of sequences (SSS) algorithm of Sun and Sessions, and developed a new algorithm called SSS(λ). 4) We examined the effectiveness of SSS(λ) by applying it to a mobile-robot navigation task, in which the SOM was used for self-classification of continuous sonar observations. 5) We proposed statistical approximation learning (SAL) for simultaneous recurrent neural networks and demonstrated that it achieves high accuracy in nonlinear function approximation. Further, we presented a novel neural network model for incremental learning.
|
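Result 4 in the abstract describes using a SOM to classify continuous sonar observations before learning. The sketch below is one way such a pipeline could look, and it is not the project's implementation: it trains a small one-dimensional Kohonen map in pure Python, uses the index of the best-matching unit as a discrete state, and applies an ordinary tabular Q-learning update to that state. It does not reproduce LQ-learning, SQ-learning, or SSS(λ). The function names (`train_som`, `classify`, `q_update`), the hyperparameters, and the synthetic three-value "sonar" readings are all assumptions chosen for illustration.

```python
import math
import random


def classify(weights, x):
    """Return the index of the best-matching unit (nearest weight vector)."""
    dists = [sum((wk - xk) ** 2 for wk, xk in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def train_som(samples, n_units=4, epochs=30, lr0=0.5, radius0=1.5, seed=0):
    """Fit a 1-D Kohonen map (hypothetical settings); returns unit weights."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise each unit at a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.1)    # shrinking neighbourhood
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            bmu = classify(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured on the 1-D unit index.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One standard tabular Q-learning step on SOM-derived state indices."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


# Synthetic stand-in for sonar data: readings near 0.1 and near 0.9.
_rng = random.Random(1)
readings = [[c + _rng.uniform(-0.05, 0.05) for _ in range(3)]
            for c in (0.1, 0.9) for _ in range(20)]

som = train_som(readings, n_units=4)
n_actions = 2
q_table = [[0.0] * n_actions for _ in som]   # one row per SOM unit

# A continuous observation becomes a discrete state via the SOM.
s = classify(som, readings[0])
s_next = classify(som, readings[-1])
q_update(q_table, s, a=1, r=1.0, s_next=s_next)
```

The design choice illustrated here is the separation of concerns: the SOM handles quantization of continuous inputs, so the learner itself can stay tabular. How the project coupled the SOM with its own algorithms is not stated in the abstract.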