2003 Fiscal Year Final Research Report Summary
A STUDY OF SYMBOLIC DATA ANALYSIS BASED ON NEIGHBORHOOD GRAPHS
Project/Area Number |
14580429
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | TOKYO DENKI UNIVERSITY |
Principal Investigator |
ICHINO Manabu, Tokyo Denki Univ., Dept. of Inf & Arts, Faculty of Science and Engineering, Professor (40057245)
|
Project Period (FY) |
2002 – 2003
|
Keywords | symbolic data / discrimination / feature selection / neighborhood graph / pattern recognition / correlation analysis / geometrical thickness / generalized correlation coefficient |
Research Abstract |
The purpose of this research is to develop methods for Symbolic Data Analysis (SDA). SDA is a new research field that deals with generalized data tables in which each sample is described not only by quantitative data but also by qualitative data. We report two research results: (1) a feature selection method for classification problems; and (2) an approach to a new form of correlation analysis.

(1) Feature selection for classification problems. When only a finite number of samples is available, adding new features to describe the samples may not improve classification performance. We therefore have to strike a balance between interclass distinguishability and the generality of class descriptions. For this purpose, we introduced two new neighborhood graphs, called "the generality ordered mutual neighborhood graph" and "the generality ordered interclass mutual neighborhood graph". Using these graphs, we obtained a simple feature selection algorithm that strikes the balance described above.

(2) Generalized correlation coefficient. Pearson's correlation coefficient is useful for detecting causality between feature variables. However, this well-known tool is not applicable to general nonlinear causal relations. If two feature variables follow a functional structure, the sample distribution with respect to those variables has a geometrically thin structure. From this viewpoint, we developed the Calhoun correlation coefficient for two features. To treat geometrical thickness in feature spaces of three or more dimensions, we introduced a neighborhood graph called the "generality ordered relative neighborhood graph". As a basic result, we found that we can evaluate the geometrical thickness of many rope-like distributions in high-dimensional feature spaces.
|
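As a small illustration of the limitation of Pearson's correlation coefficient noted in part (2) of the abstract, the sketch below computes the coefficient for a linear relation and for an exact but nonlinear one. The data, the helper name `pearson`, and the choice of a parabola are assumptions made for illustration only. They are not taken from the report, and the code does not implement the coefficient or the neighborhood graphs developed in the project.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data (not from the report): x values symmetric about zero.
xs = [i / 10 for i in range(-10, 11)]
linear = [2 * x + 1 for x in xs]   # exact linear dependence on x
parabola = [x * x for x in xs]     # exact functional, but nonlinear, dependence

r_linear = pearson(xs, linear)      # close to 1
r_parabola = pearson(xs, parabola)  # close to 0 despite y being a function of x

print(r_linear, r_parabola)
```

In both cases the points lie on a curve, so the sample distribution is geometrically thin in the sense used in the abstract, yet only the linear case yields a coefficient near 1.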
Research Products
(6 results)