Studies on fast pattern matching algorithms based on text compressions

Research Project

Project/Area Number 09680343
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Computer Science
Research Institution KYUSHU UNIVERSITY

Principal Investigator

TAKEDA Masayuki  Graduate School of Information Science and Electrical Engineering, KYUSHU UNIVERSITY, Associate Professor (50216909)

Co-Investigator(Kenkyū-buntansha) SHINOHARA Ayumi  Graduate School of Information Science and Electrical Engineering, KYUSHU UNIVER, Associate Professor (00226151)
Project Period (FY) 1997 – 1998
Project Status Completed (Fiscal Year 1998)
Budget Amount
¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 1998: ¥700,000 (Direct Cost: ¥700,000)
Keywords pattern matching in compressed texts / speeding up pattern matching by text compression / multiple pattern matching / LZW compression / Huffman encoding / finite-state encoding / byte-pair encoding / pattern matching / text compression / text database / information retrieval / data compression
Research Abstract

The aim of text compression is to reduce the space needed to store files on secondary disk storage. The traditional criterion for choosing a compression method is therefore the compression ratio. In this project we propose a new criterion: the efficiency of string pattern matching in compressed text without decoding. The goals of this project are:
Goal 1: Searching compressed text faster than decompressing it and then running a simple search.
Goal 2: Searching compressed text faster than running a simple search on the uncompressed text.
The main results of the two years of research are summarized as follows.
(1) We developed and implemented a multiple pattern matching algorithm for text compressed by the LZW method, which is used by the UNIX compress command.
(2) We also devised a more efficient algorithm for matching a single pattern in LZW compressed text, based on the Shift-And approach.
(3) We showed experimentally that the algorithms of (1) and (2) are approximately twice as fast as decompression followed by a simple search. That is, we achieved Goal 1.
(4) We showed experimentally that the algorithms of (1) and (2) are faster than a simple search on the uncompressed text. That is, we achieved Goal 2.
(5) We also developed and evaluated compressed pattern matching algorithms for other compression methods, such as byte pair encoding, Huffman encoding, finite-state encoding, and compression using antidictionaries. The project was completed successfully.
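
For orientation, the baseline named in Goal 1 (decompress, then search) can be sketched as follows. This is only an illustrative sketch, not the project's algorithms, which search the compressed text directly without decoding. It assumes a textbook LZW variant with an unbounded dictionary over characters 0 to 255 (not the exact file format of the UNIX compress command) and the classical Shift-And bit-parallel matcher on plain text. The function names lzw_compress, lzw_decompress, and shift_and_search are hypothetical.

```python
def lzw_compress(text: str) -> list[int]:
    """Textbook LZW: emit one code per longest dictionary phrase (assumes chars 0..255)."""
    table = {chr(i): i for i in range(256)}
    codes, w = [], ""
    for c in text:
        if w + c in table:
            w += c                       # keep extending the current phrase
        else:
            codes.append(table[w])       # emit the longest known phrase
            table[w + c] = len(table)    # learn the new phrase
            w = c
    if w:
        codes.append(table[w])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Inverse of lzw_compress; rebuilds the same dictionary while decoding."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for k in codes[1:]:
        if k in table:
            entry = table[k]
        elif k == len(table):            # code refers to the phrase being defined
            entry = w + w[0]
        else:
            raise ValueError(f"invalid LZW code: {k}")
        out.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return "".join(out)


def shift_and_search(text: str, pattern: str) -> list[int]:
    """Classical Shift-And: bit i of state is set when pattern[:i+1] ends at the current position."""
    m = len(pattern)
    if m == 0:
        return []
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    accept = 1 << (m - 1)
    state, hits = 0, []
    for j, c in enumerate(text):
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(j - m + 1)       # start index of the occurrence
    return hits


# Baseline of Goal 1: decode the whole text first, then scan it.
codes = lzw_compress("abababab abab")
occurrences = shift_and_search(lzw_decompress(codes), "abab")
```

The sketch makes the cost of the baseline visible: the full text is rebuilt in memory before a single character is compared, which is the overhead that searching the code sequence directly is meant to avoid.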

Report

(3 results)
  • 1998 Annual Research Report   Final Research Report Summary
  • 1997 Annual Research Report
  • Research Products

    (18 results)

All Publications (18 results)

  • [Publications] Takeda,M.: "Pattern matching machine for text compressed using finite state model" Technical Report DOI-TR-142,Kyushu University. 1-12 (1997)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Kida,T.et al.: "Multiple Pattern Matching in LZW Compressed Text" Proc.Data Compression Conference,DCC'98. 103-112 (1998)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      1998 Final Research Report Summary
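  • [Publications] Miyazaki, M. et al.: "Speeding up the pattern matching machine for compressed texts" (in Japanese) Transactions of the Information Processing Society of Japan. 39-9. 2638-2648 (1998)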

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Yamasaki,M.et al.: "Discovering characteristic patterns from collections of classical Japanese poems" Proc.1st International Conference on Discovery Science. 129-140 (1998)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Kida,T.et al.: "Shift-And approach to pattern matching in LZW compressed text" Technical Report DOI-TR-156,Kyushu University. 1-13 (1999)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Shibata,Y.et al.: "Pattern matching in text compressed by using antidictionaries" Technical Report DOI-TR-157,Kyushu University. 1-12 (1999)

    • Description
      From the Final Research Report Summary (Japanese version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Takeda, M.: "Pattern matching machine for text compressed using finite state model" Technical Report DOI-TR-142, Kyushu University. 1-12 (1997)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Kida, T.et al.: "Multiple pattern matching in LZW compressed text" Proc.Data Compression Conference (DCC98). 103-112 (1998)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Miyazaki, M.et al.: "Speeding up the pattern matching machine for compressed texts" Transactions of the Information Processing Society of Japan. Vol.39, No.9. 2638-2648 (1998)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Yamasaki, M.et al.: "Discovering characteristic patterns from collections of classical Japanese poems" Proc.1st International Conference on Discovery Science. 129-140 (1998)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Kida, T.et al.: "Shift-And approach to pattern matching in LZW compressed text" Technical Report DOI-TR-156, Kyushu University. 1-13 (1999)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      1998 Final Research Report Summary
  • [Publications] Shibata, Y.et al.: "Pattern matching in text compressed by using antidictionaries" Technical Report DOI-TR-157, Kyushu University. 1-12 (1999)

    • Description
      From the Final Research Report Summary (English version)
    • Related Report
      1998 Final Research Report Summary
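  • [Publications] Miyazaki, M.: "Speeding up the pattern matching machine for compressed texts" (in Japanese) Transactions of the Information Processing Society of Japan. 39-9. 2638-2648 (1998)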

    • Related Report
      1998 Annual Research Report
  • [Publications] T. Kida et al: "Shift-And approach to pattern matching in LZW texts" Technical Report, Department of Informatics, Kyushu Univ.156. 1-12 (1999)

    • Related Report
      1998 Annual Research Report
  • [Publications] M. Yamasaki et al: "Discovering Characteristic Patterns from Collections of Classical Japanese Poems" Lecture Notes in Artificial Intelligence. 1532. 129-140 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Y. Shibata et al: "Pattern matching in texts compressed by using antidictionaries" Technical Report, Department of Informatics, Kyushu Univ.157. 1-12 (1999)

    • Related Report
      1998 Annual Research Report
  • [Publications] T.Kida et al.: "Multiple Pattern Matching in LZW Compressed Text" Proc.Data Compression Conference,DCC'98. (to appear). (1998)

    • Related Report
      1997 Annual Research Report
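  • [Publications] Yamasaki, M. et al.: "Pattern extraction from waka poem data using the MDL principle" (in Japanese) Information Processing Society of Japan SIG Technical Report. 37-5. 29-34 (1998)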

    • Related Report
      1997 Annual Research Report

Published: 1998-04-01   Modified: 2016-04-21  
