Budget Amount
¥2,400,000 (Direct Cost: ¥2,400,000)
Fiscal Year 2006: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 2005: ¥1,700,000 (Direct Cost: ¥1,700,000)
Research Abstract
In machine learning, there are roughly two types of models: symbolic and non-symbolic. The former is comprehensible but not good at learning in a changing environment; the latter is good at learning, but the learned results are not comprehensible. The purpose of this research is to propose an approach that can learn and remain comprehensible at the same time, by combining neural networks (NNs) and decision trees (DTs). The learning model used in this research is called the neural network tree (NNTree). The main problems in using NNTrees are that the induction cost is high and the results are not comprehensible.

In the first year of this project, we proposed a heuristic method for defining the teacher signals for the data assigned to an internal node of the tree. Based on this method, the NNs in the internal nodes can be trained with supervised learning rather than the evolutionary learning we used before. This greatly reduces the induction cost.
To increase the comprehensibility of the NNTrees, we proposed to use a nearest neighbor classifier (NNC) in each internal node instead of an NN. We call this model the NNC-Tree. An NNC can provide very comprehensible decision rules if we consider each prototype as a "precedent". To design the NNCs in the internal nodes, we can use the R4-rule proposed earlier by Zhao. Experimental results show that the proposed method can induce accurate, compact, and comprehensible NNC-Trees.
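As a rough illustration of why prototypes read like "precedents", the sketch below shows an NNC internal node as a small set of prototypes, each stored with the child branch it points to: a sample follows the branch of its nearest prototype, and the decision can be explained by quoting that prototype. This is not the R4-rule itself (prototype selection is left out), and the class and method names are assumptions.

import numpy as np

class NNCNode:
    # Internal node of an NNC-Tree: a small set of prototypes ("precedents"),
    # each stored together with the child branch it points to.
    def __init__(self, prototypes, branches):
        self.prototypes = np.asarray(prototypes, dtype=float)  # (k, n_features)
        self.branches = list(branches)                          # child index per prototype

    def route(self, x):
        # 1-NN rule: follow the branch of the nearest prototype.
        d = np.linalg.norm(self.prototypes - x, axis=1)
        return self.branches[int(np.argmin(d))]

    def explain(self, x):
        # The decision is readable: quote the precedent that matched.
        i = int(np.argmin(np.linalg.norm(self.prototypes - x, axis=1)))
        return f"matched precedent {i}, so follow branch {self.branches[i]}"

# Tiny usage example with made-up prototypes:
node = NNCNode(prototypes=[[0.0, 0.0], [1.0, 1.0]], branches=[0, 1])
print(node.route(np.array([0.9, 0.8])))    # -> 1
print(node.explain(np.array([0.9, 0.8])))  # -> matched precedent 1, so follow branch 1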
In the second year, we proposed two methods for reducing the induction cost: the "attentional learning method" and the "dimensionality reduction method". In the first method, we pay more attention to difficult data and skip easy data during the R4-rule based learning; this reduces the cost of NNC-Tree induction by more than 80%. In the second method, we reduce the dimensionality of the problem using principal component analysis. When the dimensionality of the original problem space is very high (e.g., in image recognition), this method can also reduce the cost greatly.
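The two cost-reduction ideas can be pictured roughly as follows. This is not the actual R4-rule training loop: the margin test for "easy" samples and the update callback are assumptions, and PCA is shown via scikit-learn for brevity.

import numpy as np
from sklearn.decomposition import PCA

def is_easy(x, y, prototypes, labels, margin=0.2):
    # Assumption: a sample counts as "easy" if a prototype of its own class is
    # clearly closer than any prototype of another class.
    labels = np.asarray(labels)
    d = np.linalg.norm(np.asarray(prototypes, dtype=float) - x, axis=1)
    if not np.any(labels == y):
        return False                        # no matching precedent: difficult
    if not np.any(labels != y):
        return True                         # only same-class precedents: easy
    return d[labels == y].min() + margin < d[labels != y].min()

def attentional_pass(X, y, prototypes, labels, update):
    # One pass of "attentional learning": effort is spent only on difficult
    # samples; easy ones are skipped. `update` is a hypothetical callback that
    # stands in for the R4-rule's prototype adjustment.
    for xi, yi in zip(X, y):
        if is_easy(xi, yi, prototypes, labels):
            continue
        prototypes, labels = update(prototypes, labels, xi, yi)
    return prototypes, labels

def reduce_dims(X, n_components=20):
    # Second method: project high-dimensional inputs (e.g., images) onto a few
    # principal components before inducing the NNC-Tree.
    return PCA(n_components=n_components).fit_transform(X)

In this sketch, the PCA projection would be applied once up front, while the easy-sample test runs inside every training pass.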
In the future, we would like to apply NNC-Trees to solve problems with incomplete data (data with missing attributes). We would also like to consider the importance and costs of the attributes during induction of the tree. Further, we would like to visualize the induction results and try to make the NNC-Tree more comprehensible.