Data Mining Method for Multi-viewpoint and Multi-granularity Knowledge Discovery
Project/Area Number |
16300042
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Yokohama National University |
Principal Investigator |
SUZUKI Einoshin Yokohama National University, Faculty of Engineering, Associate Professor (10251638)
|
Co-Investigator(Kenkyū-buntansha) |
ANDO Shin Yokohama National University, Faculty of Engineering, Assistant Professor (70401685)
|
Project Period (FY) |
2004 – 2005
|
Project Status |
Completed (Fiscal Year 2005)
|
Budget Amount |
¥14,300,000 (Direct Cost: ¥14,300,000)
Fiscal Year 2005: ¥5,400,000 (Direct Cost: ¥5,400,000)
Fiscal Year 2004: ¥8,900,000 (Direct Cost: ¥8,900,000)
|
Keywords | Multi-viewpoint and Multi-granularity Visualization / Web Page Data / Network Intrusion Data / Probabilistic Clustering / Transactional Data / Spatio-temporal Data / Data Mining / Information Visualization / Multi-viewpoint and Multi-granularity Knowledge Discovery / Sequence Data |
Research Abstract |
We have invented a data mining method for multi-viewpoint and multi-granularity knowledge discovery that summarizes the essential parts of data with probabilistic clustering and allocates hues based on information criteria. This method is an extension of our PrototypeLines, whose effectiveness has been demonstrated with medical test data. We investigated the effectiveness of the method with Web page data, which represent typical text and image data, and showed that our method is superior to Google in terms of recall, precision, and computational time. The method was then improved and extended into the final method, whose effectiveness was evaluated quantitatively by applying it to Web page data and network intrusion data. Experiments with Web page data were performed on the task of grasping the content of a large number of Web pages from a visualization result on a single sheet of A4 paper. Because the subjects were asked many questions in a limited period, we adopted the number of correct answers given by the subjects as the evaluation index, and our method succeeded in increasing the value of this index by 35% compared with Google. Although specific routines for images and keywords are necessary, we consider that we have accomplished the initial objective of visualizing information with appropriate viewpoints and granularities for knowledge discovery. For the experiments using network intrusion data, we chose prediction problems based on Web page access logs. Excellent results were obtained in terms of recall and precision for malicious access detection, discovery of peculiar fraudulent access, and comprehensiveness of visualization results. In the process, we developed a multi-objective search method, an information evaluation index, and clustering methods for predicate logic data, and confirmed their effectiveness. In addition, we developed visualization methods for transactional data of itemsets in cooperation with the University of Caen in France and obtained excellent results. Applications to various spatio-temporal data, of which soccer data is representative, were also pursued, and excellent results were obtained in both visualization and knowledge discovery.
|
Report
(3 results)
Research Products
(28 results)