2005 Fiscal Year Final Research Report Summary
Study of High-speed Data Mining Algorithms from Massive Data Streams
Project/Area Number |
15300036
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Media informatics/Database
|
Research Institution | KYUSHU UNIVERSITY (2003, 2005) Hokkaido University (2004) |
Principal Investigator |
IKEDA Daisuke Kyushu Univ., Library, Assoc. Prof. (00294992)
|
Co-Investigator(Kenkyū-buntansha) |
TAKEDA Masayuki Kyushu Univ., Grad. School of Info. Sci. and Elec. Eng., Prof. (50216909)
SHINOHARA Ayumi Tohoku Univ., Grad. School of Info. Sci., Prof. (00226151)
KIDA Takuya Hokkaido Univ., Grad. School of Info. Sci., Assoc. Prof. (70343316)
KASAHARA Yoshiaki Kyushu Univ., Computing and Communication Center, Res. Assoc. (60284577)
ISHINO Akira Kyushu Univ., Office for Information of Univ. Evaluation, Res. Assoc. (10315129)
|
Project Period (FY) |
2003 – 2005
|
Keywords | data stream / data mining / XML data / semi-structured data / pattern matching / sequence discovery / XPath / tree mining |
Research Abstract |
In this research, we investigated a high-speed online knowledge discovery system for extracting useful information from massive semi-structured data streams. This year in particular, on the theoretical side, we further extended the theory of efficient pattern matching and pattern discovery methods for online streams. On the application side, we carried out a series of experiments on the collection and analysis of network data from real high-speed networks in a huge organization. We also published the results obtained over the three-year research period. Specifically, we pursued the following topics:
(1) Survey on semi-structured data: We summarized the work on stream data mining carried out in this project over the last three years and published it as a survey in an academic journal.
(2) Study on streaming pattern matching technology for semi-structured data: We developed an efficient method for tree pattern matching with horizontal wildcards based on bit-parallel techniques, which can potentially yield a drastic speed-up of pattern matching for the XPath and XQuery languages on huge XML data.
(3) Study on sequential and streaming pattern discovery technology for semi-structured data: We developed efficient algorithms for finding interesting patterns in massive data streams for various classes of complex patterns and motifs. This year we also published the pattern discovery algorithms developed last year, one of which received the 2004 JSAI SIG Award.
(4) Empirical study on knowledge discovery from real massive network data: As applications, we performed a series of surveys on data collection and online analysis of a high-speed, large-scale network for a medium-sized organization at Kyushu University. These experiments will give insights for future research on the development of efficient pattern matching and discovery algorithms for high-speed streaming data.
|
Research Products
(18 results)