2007 Fiscal Year Final Research Report Summary
Structured Data Mining System which Considers Interactions of Structured Rules
Project/Area Number |
18300047
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Kyushu University |
Principal Investigator |
SUZUKI Einoshin Kyushu University, Faculty of Information System and Electrical Engineering, Professor (10251638)
|
Project Period (FY) |
2006 – 2007
|
Keywords | Structured rule / Action rule / Extended MDL Principle / Interestingness of patterns / Data Mining |
Research Abstract |
Structured rules, defined as sets of mutually related rules, convey more related information than a single rule and are therefore expected to attract the user's interest more often. In this project, we assume that the structured rules that are candidates for discovery and the structured rules supplied by the user influence each other during the discovery process, and we have built a general data mining system that narrows the discovery outcomes to interesting ones by taking this mutual influence into account. First, we chose action rules, each of which proposes changes to the values of actionable attributes, as the representation of the discovery outcome, devised a discovery method, and implemented it as a prototype system. The system is a general data mining method that discovers, from massive disk-resident data, action rules with high achievability for changing a bad class into a good class. The achievability of each action rule is evaluated with a Naive Bayes classifier learned from the examples of both classes. We demonstrated the effectiveness of the proposed method in experiments on data sets including the U.S. Census. Second, we devised a structured data mining method that discovers a partial decision list, a natural form of structured knowledge, based on information compression and without specifying the kinds of domain knowledge. We realized this as an extended Minimum Description Length principle, which supports the discovery of knowledge that explains part of the example space while taking domain knowledge into account, together with a search method for the principle. The search method tries three heuristic search methods and returns the hypothesis with the shortest description length. We implemented it as a prototype system, and its evaluation yielded many interesting results, including high robustness against noise. We also developed related data mining and search methods that may serve as foundations for the proposed methods.
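The selection step described in the abstract (run several heuristic searches, keep the hypothesis with the shortest description length) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the project: the toy bit costs in `description_length`, the three placeholder searches (`search_empty`, `search_best_single`, `search_greedy`), the rule encoding, and the `DATA` set. The project's extended MDL principle, which also accounts for domain knowledge, is not reproduced.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical encoding: a rule is (attribute, value, predicted_class);
# a partial decision list is an ordered tuple of rules and may leave
# some examples uncovered.
Rule = Tuple[str, str, str]
Hypothesis = Tuple[Rule, ...]
Example = Tuple[Dict[str, str], str]


def description_length(h: Hypothesis, data: Sequence[Example],
                       rule_bits: float = 4.0, error_bits: float = 3.0,
                       uncovered_bits: float = 1.0) -> float:
    """Toy two-part code length (assumed costs, not the extended MDL)."""
    total = rule_bits * len(h)  # cost of stating the rules
    for x, y in data:
        # The first rule whose condition matches decides the example.
        fired = next((r for r in h if x.get(r[0]) == r[1]), None)
        if fired is None:
            total += uncovered_bits  # example left unexplained
        elif fired[2] != y:
            total += error_bits      # example explained wrongly
    return total


def candidate_rules(data: Sequence[Example]):
    """Every (attribute, value, class) combination seen in the data."""
    return sorted({(a, v, y) for x, y in data for a, v in x.items()})


def search_empty(data: Sequence[Example]) -> Hypothesis:
    return ()


def search_best_single(data: Sequence[Example]) -> Hypothesis:
    return min(((r,) for r in candidate_rules(data)),
               key=lambda h: description_length(h, data), default=())


def search_greedy(data: Sequence[Example]) -> Hypothesis:
    h: Hypothesis = ()
    while True:
        # Append the rule that shortens the description the most.
        best = min((h + (r,) for r in candidate_rules(data) if r not in h),
                   key=lambda c: description_length(c, data), default=h)
        if description_length(best, data) >= description_length(h, data):
            return h
        h = best


def shortest_hypothesis(
        data: Sequence[Example],
        searches: Sequence[Callable[[Sequence[Example]], Hypothesis]],
) -> Hypothesis:
    """Run every search and keep the result with the shortest description."""
    return min((s(data) for s in searches),
               key=lambda h: description_length(h, data))


# Made-up data: two clean regions and one mixed ("cloudy") region.
DATA = ([({"outlook": "sunny"}, "no")] * 6
        + [({"outlook": "rainy"}, "yes")] * 6
        + [({"outlook": "cloudy"}, "yes"), ({"outlook": "cloudy"}, "no")])

best = shortest_hypothesis(
    DATA, [search_empty, search_best_single, search_greedy])
```

On this toy data the selected list covers the two clean regions and leaves the mixed one unexplained, which is the sense in which a decision list can be partial: under these assumed costs, adding a rule for the noisy region would lengthen the description.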
|
Research Products
(40 results)