Co-Investigator (Kenkyū-buntansha) |
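SHIMOZONO Shinichi, KYUSHU UNIVERSITY, Department of Artificial Intelligence (Faculty of Information Engineering), Associate Professor (70243988)
SAKAMOTO Hiroshi, KYUSHU UNIVERSITY, Department of Informatics (Graduate School, Faculty of Information Science and Electrical Engineering), Research Associate (50315123)
TAKEDA Masayuki, KYUSHU UNIVERSITY, Department of Informatics (Graduate School, Faculty of Information Science and Electrical Engineering), Associate Professor (50216909)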
|
Research Abstract |
From a theoretical standpoint on compressed pattern matching, we introduced a unified framework, called the collage system, for various dictionary-based data compression methods. We developed both Knuth-Morris-Pratt type and Boyer-Moore type pattern matching algorithms for collage systems. We adapted these algorithms to the Byte-Pair-Encoding compression method, which yields the fastest compressed pattern matching algorithm in practice. Multiple pattern matching and approximate string matching were also handled successfully within collage systems. We also applied the method to Sequitur, another promising compression program, and verified its performance. Moreover, we studied efficient fully compressed pattern matching for balanced straight-line programs, where not only the text strings but also the pattern strings are compressed. We also developed an online algorithm that constructs a subsequence automaton from a given set of strings; the automaton accepts all subsequences of any string in the set. The algorithm is the fastest, and we verified that it is quite useful for accelerating a knowledge discovery system. Concerning knowledge discovery from databases, we studied the learnability of tree transformation rules from examples, and the search for optimal association rules of words in large text databases. Journal of Discrete Algorithms, 1(1), 2000
|
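As an illustration of what the subsequence automaton mentioned in the abstract accepts, the following is a minimal Python sketch. It is not the online construction algorithm from this project: instead of building a single automaton, it precomputes a next-occurrence table for each string and simulates all tables in parallel. The function names `build_next_table` and `accepts_subsequence` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional


def build_next_table(s: str) -> List[Dict[str, int]]:
    """For each position i, map a character c to the state reached after
    consuming the first occurrence of c at or after position i (index + 1)."""
    n = len(s)
    table: List[Dict[str, int]] = [dict() for _ in range(n + 1)]
    # Fill from right to left so each row can reuse the row after it.
    for i in range(n - 1, -1, -1):
        table[i] = dict(table[i + 1])
        table[i][s[i]] = i + 1
    return table


def accepts_subsequence(tables: List[List[Dict[str, int]]], pattern: str) -> bool:
    """Return True if pattern is a subsequence of at least one indexed string.

    The state holds one position per string, or None once that string
    can no longer match (assumption: a naive parallel simulation, not a
    merged or minimized automaton).
    """
    state: List[Optional[int]] = [0] * len(tables)
    for c in pattern:
        state = [
            None if pos is None else tables[k][pos].get(c)
            for k, pos in enumerate(state)
        ]
        if all(pos is None for pos in state):
            return False
    return any(pos is not None for pos in state)


if __name__ == "__main__":
    tables = [build_next_table(s) for s in ["abcab", "bca"]]
    for query in ["aab", "bca", "cc", "cba"]:
        print(query, accepts_subsequence(tables, query))
```

Each query takes time proportional to the pattern length times the number of strings. A single merged automaton, as described in the abstract, would avoid the per-string factor at query time.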