Research Abstract |
CART and C4.5 are typical non-parametric classification methods based on decision trees. This research discusses importing such a decision-tree-based, non-parametric approach into canonical discriminant analysis. We call the resulting method CDAT (Canonical Discriminant Analysis Tree). In CDAT, the index of improvement obtained by dividing a region into two is defined as the difference between the sum, before division, of the eigenvalues of the between-groups variance-covariance matrix relative to the within-groups variance-covariance matrix and the corresponding sum after division. A CDAT routine was implemented in the C programming language on a workstation. To capture variety in handwriting, three writers each prepared learning and testing data sets consisting of five kinds of handwritten characters selected from 109 cursive kana letters, using the X Window System and C on a workstation. CDAT was then compared with canonical discriminant analysis on these data sets by cross-validation. As a result, when handwriting varied between writers, CDAT outperformed canonical discriminant analysis, provided that a threshold was set so that the subregions produced by the decision tree did not become too small. For data sets with nonlinear structure, it is effective in CDAT to stack layers of decision trees recursively.
|
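The improvement index described in the abstract can be illustrated with a minimal sketch. It is not the authors' C routine, and the abstract gives no formulas, so several points are assumptions: the function names (`scatter_matrices`, `eigenvalue_sum`, `criterion`, `split_improvement`) are hypothetical, unnormalized scatter matrices stand in for the variance-covariance matrices, the two subregions are combined by a sample-size-weighted average, the index is taken as the after-division value minus the before-division value, and `min_size` plays the role of the threshold that keeps subregions from becoming too small. The sum of the eigenvalues of W^{-1}B equals its trace, so no eigen-decomposition is needed.

```python
# Illustrative sketch only; all names and combination rules are assumptions.

def scatter_matrices(points, labels):
    """Return (B, W): between-groups and pooled within-groups scatter matrices."""
    n, d = len(points), len(points[0])
    grand = [sum(p[j] for p in points) / n for j in range(d)]
    B = [[0.0] * d for _ in range(d)]
    W = [[0.0] * d for _ in range(d)]
    for g in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == g]
        m = len(members)
        mean = [sum(p[j] for p in members) / m for j in range(d)]
        for i in range(d):
            for j in range(d):
                B[i][j] += m * (mean[i] - grand[i]) * (mean[j] - grand[j])
                W[i][j] += sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in members)
    return B, W


def eigenvalue_sum(B, W):
    """Sum of the eigenvalues of W^{-1} B, computed as trace(W^{-1} B)."""
    d = len(W)
    # Gauss-Jordan elimination turns the augmented matrix [W | B] into [I | W^{-1} B].
    A = [list(W[i]) + list(B[i]) for i in range(d)]
    for c in range(d):
        pivot = max(range(c, d), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("within-groups matrix is singular")
        A[c], A[pivot] = A[pivot], A[c]
        pv = A[c][c]
        A[c] = [v / pv for v in A[c]]
        for r in range(d):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return sum(A[i][d + i] for i in range(d))


def criterion(points, labels):
    """Separation criterion of one region: sum of eigenvalues of W^{-1} B."""
    B, W = scatter_matrices(points, labels)
    return eigenvalue_sum(B, W)


def split_improvement(points, labels, feature, threshold, min_size=5):
    """Improvement from splitting a region at points[i][feature] <= threshold.

    Returns None when either subregion has fewer than min_size samples
    (assumed form of the size threshold mentioned in the abstract).
    """
    left = [i for i, p in enumerate(points) if p[feature] <= threshold]
    right = [i for i, p in enumerate(points) if p[feature] > threshold]
    if min(len(left), len(right)) < min_size:
        return None
    before = criterion(points, labels)
    after = 0.0
    for idx in (left, right):
        weight = len(idx) / len(points)  # assumed sample-size weighting
        after += weight * criterion([points[i] for i in idx],
                                    [labels[i] for i in idx])
    return after - before  # assumed sign: positive means the split helps
```

Under these assumptions, a tree builder would evaluate `split_improvement` over candidate features and thresholds, keep the best positive value, and recurse into the two subregions. On XOR-like data, where the class means coincide over the whole region but separate within each half, the before-division criterion is zero and the index is positive, which matches the abstract's remark about data with nonlinear structure.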