2001 Fiscal Year Final Research Report Summary
A STUDY OF FEATURE (ATTRIBUTE) SELECTION IN DATA MINING
Project/Area Number |
12680398
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Tokyo Denki University |
Principal Investigator |
MANABU Ichino Tokyo Denki University, Department of Information and Arts, Professor; School of Science and Engineering, Professor (40057245)
|
Project Period (FY) |
2000 – 2001
|
Keywords | pattern recognition / data mining / feature selection / neighborhood graph / feature evaluation / geometrical thickness / analysis of variance / Calhoun correlation coefficient |
Research Abstract |
The purpose of this research is to develop methods of feature (attribute) selection in data mining. We first report results on feature selection for classification problems, and then a new correlation coefficient that is applicable to various nonlinear relationships between feature variables.
1) Feature selection for classification problems. When only a finite number of samples is available, adding new features to describe the samples may not improve classification performance. We therefore have to strike a balance between interclass distinguishability and the generality of class descriptions. We introduced two graphs, the generality ordered mutual neighborhood graph and the generality ordered interclass mutual neighborhood graph, and then developed a feature selection algorithm based on modified zero-one integer programming, together with a simplified version of that algorithm.
2) Generalized correlation coefficient. Pearson's correlation coefficient is useful for detecting causality between feature variables. However, this well-known tool is not applicable to general nonlinear causal relations. If two feature variables follow a functional structure, the sample distribution with respect to those variables has a geometrically thin structure. We developed a generalized correlation coefficient, called the Calhoun correlation coefficient. This new measure is able to evaluate various nonlinear functional relations and other geometrically thin structures.
|
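The limitation of Pearson's coefficient noted in item 2) can be illustrated with a minimal sketch. It is not the Calhoun correlation coefficient, whose definition is not given in this summary; the helper `pearson` and the sample data are assumptions chosen only for illustration. A perfectly functional but nonlinear relation (a parabola sampled symmetrically about zero) yields a Pearson coefficient of essentially zero, while a linear relation yields a value close to one.

```python
import math

def pearson(xs, ys):
    """Plain Pearson product-moment correlation (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Hypothetical sample: x values symmetric about zero.
xs = list(range(-50, 51))
linear = [2 * x + 1 for x in xs]   # y depends linearly on x
parabola = [x * x for x in xs]     # y is a function of x, but nonlinear

r_linear = pearson(xs, linear)      # close to 1
r_parabola = pearson(xs, parabola)  # close to 0 despite y = f(x)
```

Both samples lie exactly on a curve, so both are "geometrically thin" in the sense of the abstract, yet only the linear one is detected by Pearson's coefficient.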
Research Products
(10 results)