Research for Efficient Algorithms of Large-Scale Database Analysis Based on Binary Decision Diagrams
Project/Area Number |
17300041
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Hokkaido University |
Principal Investigator |
MINATO Shin-Ichi Hokkaido University, Grad. School of IST, Associate Professor (10374612)
|
Co-Investigator(Kenkyū-buntansha) |
ZEUGMANN Thomas Hokkaido University, Grad. School of IST, Professor (60374609)
KIDA Takuya Hokkaido University, Grad. School of IST, Associate Prof. (70343316)
OKUBO Yoshiaki Hokkaido University, Grad. School of IST, Assistant Professor (40271639)
|
Project Period (FY) |
2005 – 2007
|
Project Status |
Completed (Fiscal Year 2007)
|
Budget Amount |
¥9,950,000 (Direct Cost: ¥9,200,000, Indirect Cost: ¥750,000)
Fiscal Year 2007: ¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000)
Fiscal Year 2006: ¥2,600,000 (Direct Cost: ¥2,600,000)
Fiscal Year 2005: ¥4,100,000 (Direct Cost: ¥4,100,000)
|
Keywords | Binary Decision Diagram / BDD / ZBDD / Data Mining / Database Analysis / Basic Software |
Research Abstract |
Binary Decision Diagrams (BDDs) are an efficient data structure for representing Boolean functions in main memory. Techniques for BDD manipulation have been developed in the area of VLSI logic design since the 1990s. Recently, we found that BDD-based techniques can also be applied effectively to problems in data mining and knowledge discovery. In particular, Zero-suppressed BDDs (ZBDDs), a variant of BDDs, are well suited to handling the sets of sparse combinations that often appear in real-life database analysis. In this research, we developed efficient ZBDD-based techniques for large-scale database analysis, as follows. (1) We proposed "LCM over ZBDDs," a fast algorithm for generating very large sets of all, closed, or maximal frequent itemsets. This algorithm is based on one of the most efficient state-of-the-art algorithms proposed thus far. It not only enumerates the itemsets but also generates a compact output data structure in main memory. (2) We proposed a new method for discovering hidden information in large-scale transaction databases by considering a property called "cofactor implication." Cofactor implication is a generalization of "symmetric itemsets," a notion that is well known in the VLSI CAD area. We developed an efficient algorithm for extracting all non-trivial item pairs with cofactor implication. (3) We presented VSOP, a program developed for calculating combinatorial itemsets specified by symbolic expressions. Based on ZBDD techniques, VSOP can efficiently handle large-scale sum-of-products expressions with a large number of item symbols. VSOP supports not only Boolean set operations but also numerical arithmetic operations based on the "Valued-Sum-Of-Products" algebra, such as addition, subtraction, multiplication, division, and numerical comparison. VSOP will facilitate research and development on various database analysis problems.
|
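The idea of representing sets of sparse combinations with ZBDDs, as described in the abstract, can be illustrated with a minimal sketch. This is not the project's software: the class `ZBDD` and its methods (`node`, `itemset`, `union`, `from_sets`, `to_sets`) are hypothetical names chosen for illustration, items are assumed to be integer ids, and the variable order is assumed to be ascending by id. The sketch shows only the two standard reduction rules (node sharing and zero-suppression) and a union operation.

```python
class ZBDD:
    """Illustrative ZBDD manager (hypothetical API, not the project's code)."""

    EMPTY = 0  # terminal 0: the empty family {}
    BASE = 1   # terminal 1: the family {{}} holding only the empty set

    def __init__(self):
        self.nodes = [None, None]  # index -> (var, lo, hi); 0 and 1 are terminals
        self.unique = {}           # (var, lo, hi) -> index, for node sharing
        self.cache = {}            # memo table for union

    def node(self, var, lo, hi):
        # Zero-suppression rule: drop a node whose 1-edge leads to terminal 0.
        if hi == self.EMPTY:
            return lo
        key = (var, lo, hi)
        if key not in self.unique:  # sharing rule: one node per distinct triple
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def _top(self, f):
        # Terminals sort after every real variable.
        return self.nodes[f][0] if f > 1 else float("inf")

    def itemset(self, items):
        # Family containing exactly one combination; smaller ids sit nearer the root.
        f = self.BASE
        for v in sorted(set(items), reverse=True):
            f = self.node(v, self.EMPTY, f)
        return f

    def union(self, f, g):
        if f == self.EMPTY:
            return g
        if g == self.EMPTY or f == g:
            return f
        key = (min(f, g), max(f, g))
        if key in self.cache:
            return self.cache[key]
        vf, vg = self._top(f), self._top(g)
        v = min(vf, vg)
        # Cofactors with respect to v; a family not mentioning v has an empty 1-cofactor.
        f0, f1 = self.nodes[f][1:] if vf == v else (f, self.EMPTY)
        g0, g1 = self.nodes[g][1:] if vg == v else (g, self.EMPTY)
        r = self.node(v, self.union(f0, g0), self.union(f1, g1))
        self.cache[key] = r
        return r

    def from_sets(self, sets):
        f = self.EMPTY
        for s in sets:
            f = self.union(f, self.itemset(s))
        return f

    def to_sets(self, f):
        # Enumerate the family explicitly (only sensible for small examples).
        if f == self.EMPTY:
            return set()
        if f == self.BASE:
            return {frozenset()}
        var, lo, hi = self.nodes[f]
        return self.to_sets(lo) | {s | {var} for s in self.to_sets(hi)}


z = ZBDD()
family = z.from_sets([{1, 2}, {2, 3}, {3}])
same_family = z.from_sets([{3}, {2, 3}, {1, 2}])
```

Because nodes are shared and reduced under a fixed variable order, the same family built in a different order ends up at the same node, so equality of two families in this sketch is a single integer comparison (`family == same_family`).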
Report
(4 results)
Research Products
(40 results)