Project/Area Number |
11694159
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Osaka University |
Principal Investigator |
MOTODA Hiroshi, Institute of Scientific and Industrial Research, Osaka University, Professor (00283804)
|
Co-Investigator(Kenkyū-buntansha) |
YOSHIDA Tetsuya, Institute of Scientific and Industrial Research, Osaka University, Research Associate (80294164)
WASHIO Takashi, Institute of Scientific and Industrial Research, Osaka University, Associate Professor (00192815)
HORIUCHI Tadashi, Institute of Scientific and Industrial Research, Osaka University, Research Associate (50294129)
|
Project Period (FY) |
1999 – 2001
|
Project Status |
Completed (Fiscal Year 2001)
|
Budget Amount |
¥8,700,000 (Direct Cost: ¥8,700,000)
Fiscal Year 2001: ¥3,000,000 (Direct Cost: ¥3,000,000)
Fiscal Year 2000: ¥2,700,000 (Direct Cost: ¥2,700,000)
Fiscal Year 1999: ¥3,000,000 (Direct Cost: ¥3,000,000)
|
Keywords | Machine Learning / Feature Selection / Feature Construction / Case Selection / Numerical Discretization / Knowledge Acquisition / Data Mining / International Collaboration / MDL / AIC |
Research Abstract |
A new generation of computational techniques and tools is required to support the extraction of useful knowledge from rapidly growing volumes of data. In this research project we aimed to develop effective methods for feature selection, instance selection and feature construction, and to integrate them to form the basis of a workbench for machine learning and data mining. For feature selection, we investigated various performance measures (distance, uncertainty, dependency, consistency and error rate) and various search methods (heuristic, complete and random search), and proposed a design strategy indicating which method to use for which kind of dataset. Further, we proposed a new method, ABB, which uses the consistency measure and performs a very efficient complete search. For instance selection, we proposed a new method, S^3 Bagging, which combines random subsampling with committee learning and was expected to reduce the amount of data by 90%. For feature construction, we proposed two new methods. One is multi-strategy learning, in which graph-based induction (GBI), which is based on repeated chunking of paired nodes, serves as a feature constructor for a decision tree classifier. The other constructs new features from association rules. Both were tested on various datasets and confirmed to be effective. All of these are components of the workbench, which we expect to contribute to mining better knowledge more efficiently.
|
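The consistency measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition of an inconsistency rate: instances are grouped by their values on the chosen features, and every instance that does not belong to the majority class of its group counts as inconsistent. The function names, the `threshold` parameter and the brute-force search by subset size are assumptions made for illustration only. The search shown here is a plain exhaustive enumeration and does not reproduce ABB, which the abstract describes as a far more efficient complete search.

```python
from collections import Counter, defaultdict
from itertools import combinations


def inconsistency_rate(rows, labels, subset):
    """Share of instances that disagree with the majority class of their pattern.

    rows   : sequence of equal-length tuples of discrete feature values
    labels : sequence of class labels, one per row
    subset : tuple of feature indices to project onto
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1
    # For each pattern, everything outside the majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def smallest_consistent_subset(rows, labels, threshold=0.0):
    """Hypothetical brute-force complete search, smallest subsets first."""
    n_features = len(rows[0])
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if inconsistency_rate(rows, labels, subset) <= threshold:
                return subset
    # Fall back to all features if no subset meets the threshold.
    return tuple(range(n_features))


# Toy data (invented for illustration): the class equals the second feature.
rows = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
labels = [0, 1, 0, 1]
selected = smallest_consistent_subset(rows, labels)
```

Because subsets are enumerated in order of increasing size, the first subset that meets the threshold is also a smallest one. The cost grows exponentially with the number of features, which is why a pruned complete search is attractive in practice.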