2001 Fiscal Year Final Research Report Summary

Integrated Machine Learning Workbench for Data Mining

Research Project

Project/Area Number 11694159
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Intelligent informatics
Research Institution Osaka University

Principal Investigator

MOTODA Hiroshi  Institute of Scientific and Industrial Research, Osaka University, Professor (00283804)

Co-Investigator(Kenkyū-buntansha) YOSHIDA Tetsuya  Institute of Scientific and Industrial Research, Osaka University, Research Associate (80294164)
WASHIO Takashi  Institute of Scientific and Industrial Research, Osaka University, Associate Professor (00192815)
Project Period (FY) 1999 – 2001
Keywords Machine Learning / Feature Selection / Feature Construction / Case Selection / Numerical Discretization / Knowledge Acquisition / Data Mining / International Collaboration
Research Abstract

A new generation of computational techniques and tools is required to support the extraction of useful knowledge from rapidly growing volumes of data. This research project aimed to develop effective methods for feature selection, instance selection, and feature construction, and to integrate them as the basis of a workbench for machine learning and data mining. For feature selection, we investigated various evaluation measures (distance, uncertainty, dependency, consistency, and error rate) and various search methods (heuristic, complete, and random search), and proposed a design strategy indicating which method to use for which kind of dataset. We also proposed a new method, ABB, which uses the consistency measure and performs a very efficient complete search. For instance selection, we proposed a new method, S^3 Bagging, which combines random subsampling with committee learning and is expected to reduce the amount of data by 90%. For feature construction, we proposed two new methods. The first is a multi-strategy learning approach in which graph-based induction (GBI), which is based on repeated chunking of paired nodes, serves as a feature constructor for a decision tree classifier. The second constructs new features from association rules. Both were tested on various datasets and confirmed to be effective. All of these methods are components of the workbench, which we expect to contribute to mining better knowledge more efficiently.
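
As an illustration of the consistency measure named in the abstract, the following minimal Python sketch computes an inconsistency rate for a feature subset: instances are grouped by their values on the selected features, and every instance outside the majority class of its group counts as inconsistent. This follows the definition commonly used in consistency-based feature selection and is an assumption here, since the abstract gives no formula. It is not the project's ABB implementation, and the function name, parameters, and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, features):
    """Return the fraction of instances that disagree with the majority
    class among instances sharing the same values on `features`.

    rows     -- sequence of tuples of discrete feature values (hypothetical format)
    labels   -- class label for each row
    features -- indices of the features in the candidate subset
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in features)
        groups[pattern][label] += 1
    # Within each pattern, everything except the majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


# Toy data (assumed): feature 0 determines the class, feature 1 is noise.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "a", "b", "b"]

rate_f0 = inconsistency_rate(rows, labels, [0])
rate_f1 = inconsistency_rate(rows, labels, [1])
print(rate_f0, rate_f1)
```

In this toy data, feature 0 alone separates the classes (rate 0.0), while feature 1 alone leaves half of the instances inconsistent (rate 0.5), so a consistency-driven search would prefer the subset containing only feature 0.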

  • Research Products

    (20 results)

All Publications (20 results)

  • [Publications] Masahiro Terabe: "Fast Classifier Generation by S^3 Bagging" Mathematical Modeling and Its Applications. 42. 25-38 (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
  • [Publications] Hiroshi Motoda: "Mining Patterns from Graph Structured Data" Proc. of the Fifth International Workshop on Multistrategy Learning. 137-150 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
  • [Publications] Manoranjan Dash: "Consistency Based Feature Selection" Proc. of the 4th Pacific Asia Conference on Knowledge Discovery and Data Mining. 98-109 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
  • [Publications] Bahua Gu: "Efficiently Determining the Starting Sample Size for Progressive Sampling" Proc. of the 12th European Conference on Machine Learning. 192-202 (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
  • [Publications] Manoranjan Dash: "Efficient Hierarchical Clustering Algorithms using Partially Overlapping Partitions" Proc. of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining. 495-506 (2001)

    • Description
      From "Summary of Research Results Report (Japanese)"
  • [Publications] Takashi Matsuda: "Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data" Proc. of the Third International Conference on Discovery Science. 99-111 (2000)

    • Description
      From "Summary of Research Results Report (Japanese)"
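  • [Publications] Atsuyuki Suzuki: "Design, Operation, and Evaluation of Systems, Iwanami Lecture Series: Fundamentals of Modern Engineering (Design Series V)" Iwanami Shoten. 165 (2002)

    • Description
      From "Summary of Research Results Report (Japanese)"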
  • [Publications] M. Terabe: "Attribute Generation based on Association Rules" J. of JSAI. Vol.15. 187-197 (2000)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] T. Kayama: "Classification Rule Learning from Tree Structured Data by Stepwise Pair Expansion" J. of JSAI. Vol.15. 485-494 (2000)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] A. Inokuchi: "Fast and Complete Mining Method for Frequent Graph Patterns" J. of JSAI. Vol.15. 1052-1063 (2000)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] T. Wada: "The Synthesis of Ripple Down Rules Method with an Inductive Learning using MDL Principle" J. of JSAI. Vol.16. 268-278 (2001)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] T. Matsuda: "Graph-Based Induction for General Graphs and Its Application" J. of JSAI. Vol.16. 363-374 (2001)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] Huan Liu: "Efficient Search of Reliable Exception" Proc. of the Third Pacific Asia Conference on Knowledge Discovery and Data Mining. 194-203 (1999)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] M. Terabe: "A Fast Classification by S^3 Bagging" J. IPSJ TOM. Vol.42. 25-38 (2001)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] H. Motoda: "Mining Patterns from Graph Structured Data" Proc. of the Fifth International Workshop on Multistrategy Learning. 137-150 (2000)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] Manoranjan Dash: "Consistency Based Feature Selection" Proc. of the 4th Pacific Asia Conference on Knowledge Discovery and Data Mining. 98-109 (2000)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] Bahua Gu: "Efficiently Determining the Starting Sample Size for Progressive Sampling" Proc. of the 12th European Conference on Machine Learning. 192-202 (2001)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] Manoranjan Dash: "Efficient Hierarchical Clustering Algorithms using Partially Overlapping Partitions" Proc. of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining. 495-506 (2001)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] T. Matsuda: "Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data" Proc. of the Third International Conference on Discovery Science. 99-111 (2000)

    • Description
      From "Summary of Research Results Report (English)"
  • [Publications] Huan Liu: "Instance Selection and Construction for Data Mining" Kluwer Academic Publishers. (2001)

    • Description
      From "Summary of Research Results Report (English)"

Published: 2003-09-17  
