2007 Fiscal Year Final Research Report Summary
Data mining technique from huge graph structured data which are lossless compressed
Grant-in-Aid for Scientific Research (C)
|Allocation Type||Single-year Grants |
|Research Institution||Hiroshima City University |
UCHIDA Tomoyuki Hiroshima City University, Graduate School of Information Sciences, Associate Professor (70264934)
SHOUDAI Takayoshi Kyushu University, Graduate School of Information Science and Electrical Engineering, Associate Professor (50226304)
MIYAHARA Tetsuhiro Hiroshima City University, Graduate School of Information Sciences, Associate Professor (90209932)
SUZUKI Yusuke Hiroshima City University, Graduate School of Information Sciences, Assistant Professor (10398464)
|Project Period (FY)||2005 – 2007|
|Keywords||Algorithm / Data Mining / Machine Learning / Algorithmic Graph Theory|
Due to the rapid growth of the Internet, many kinds of graph-structured data, such as Web documents, electric power wiring diagrams, and chemical compounds, have become accessible online. The purpose of this research is to present efficient graph mining algorithms for finding characteristic graph patterns in losslessly compressed graph-structured data. The results of this research are as follows.
1. For tree-structured data such as Web documents, we gave polynomial-time learning algorithms in the inductive inference model and in the query learning model. Moreover, we presented tree mining algorithms for tree-structured data.
2. To provide graph mining techniques for graph-structured data, we gave a polynomial-time matching algorithm and a polynomial-time algorithm for solving the minimal language problem for TTSP graph patterns, which are a knowledge representation for electric power wiring diagrams. With these two algorithms, we showed that the class of TTSP graphs is inductively inferable from positive data. In the query learning model, we showed that finite unions of TTSP graph patterns are polynomial-time learnable from queries. Moreover, we presented a graph mining algorithm for finding characteristic graph patterns in a set of outerplanar graphs, which serve as a data model of chemical compounds.
3. Based on Lempel-Ziv compression for strings, we proposed a lossless compression algorithm for huge trees. Through several experiments, we showed that the proposed algorithm performs well. Moreover, based on the XBW transformation for trees given by Ferragina et al. in 2005, we presented an XBW transformation of losslessly compressed trees. We then presented an efficient search algorithm for finding all occurrences of a given path on XBW structures of losslessly compressed trees.
4. Based on the XBW transformation for huge losslessly compressed trees, we proposed an XBW transformation for TTSP graphs. Moreover, we presented an efficient search algorithm for finding all occurrences of a given path on XBW structures of TTSP graphs.
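To illustrate the kind of path search mentioned in items 3 and 4, the following is a minimal sketch of the sorting idea behind the XBW transformation of Ferragina et al. on an ordinary, uncompressed labeled tree. It is not the project's algorithm: it does not handle compressed trees or TTSP graphs, and it keeps the upward paths explicitly instead of using the succinct rank/select machinery of the real XBW structure. The names `xbw_rows` and `find_path`, the nested-tuple tree encoding, and the example tree are assumptions made only for this illustration.

```python
from bisect import bisect_left, bisect_right

def xbw_rows(tree):
    """Return one row (pi, label, is_last) per node, stably sorted by pi.

    tree is a nested tuple (label, [children]); this encoding is an assumption.
    pi is the tuple of labels on the upward path from the node's parent to the root.
    """
    rows = []

    def visit(node, pi, is_last):
        label, children = node
        rows.append((pi, label, is_last))
        child_pi = (label,) + pi  # upward path seen from each child
        for i, child in enumerate(children):
            visit(child, child_pi, i == len(children) - 1)

    visit(tree, (), True)
    # Python's sort is stable, so nodes with equal pi stay in preorder.
    rows.sort(key=lambda row: row[0])
    return rows

def find_path(rows, path):
    """Return the row indices of nodes that end a downward path labeled `path`.

    path is a non-empty list of labels read from ancestor to descendant.
    """
    *ancestors, target = path
    key = tuple(reversed(ancestors))
    # Rows are sorted by pi, so rows whose pi starts with key are contiguous.
    prefixes = [pi[:len(key)] for pi, _, _ in rows]
    lo = bisect_left(prefixes, key)
    hi = bisect_right(prefixes, key)
    return [i for i in range(lo, hi) if rows[i][1] == target]

# Hypothetical example tree: A(B(D, E), C(D, B(D)))
example = ("A", [("B", [("D", []), ("E", [])]),
                 ("C", [("D", []), ("B", [("D", [])])])])
example_rows = xbw_rows(example)
```

Because all nodes sharing an upward-path prefix end up in one contiguous block, a path query reduces to locating that block and filtering by the final label. For instance, `find_path(example_rows, ["B", "D"])` locates the two `D` nodes whose parent is labeled `B`. The recursive traversal is only suitable for small trees; a huge tree would need an iterative traversal and a compact representation.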
Research Products (12 results)