Pattern Discovery and Data Classification Based on String Compression

Research Project

Project/Area Number 22680014
Research Category

Grant-in-Aid for Young Scientists (A)

Allocation Type Single-year Grants
Research Field Intelligent informatics
Research Institution Kyushu University

Principal Investigator

BANNAI Hideo  Kyushu University, Faculty of Information Science and Electrical Engineering, Associate Professor (20323644)

Project Period (FY) 2010 – 2012
Project Status Completed (Fiscal Year 2012)
Budget Amount
¥7,540,000 (Direct Cost: ¥5,800,000, Indirect Cost: ¥1,740,000)
Fiscal Year 2012: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
Fiscal Year 2011: ¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000)
Fiscal Year 2010: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Keywords compressed string processing / pattern discovery / straight-line program / character n-gram / string pattern discovery / string data classification / q-gram
Research Abstract

Compressed string processing is an approach that processes a compressed representation of a string without explicitly decompressing it. In this study, we investigated applying this approach to string pattern discovery and string data classification, and developed several efficient algorithms. In particular, for the q-gram frequencies problem, we developed a practically efficient algorithm that can be faster than directly processing the uncompressed text, demonstrating the effectiveness of the approach for string pattern discovery and string classification problems.
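To make the q-gram frequencies problem concrete, here is a minimal Python sketch of one simple way to count q-grams directly on a straight-line program (SLP), that is, a grammar in which every rule is either a single character or a concatenation of two earlier rules. It is an illustration under stated assumptions, not the project's own implementation: the input format (`rules` as a list of characters and index pairs) and the function name `qgram_frequencies` are hypothetical. The idea is that each q-gram occurrence crosses the boundary of exactly one rule, so it suffices to scan a window of at most 2(q-1) characters per rule and weight it by how often that rule is used in the derivation.

```python
from collections import Counter


def qgram_frequencies(rules, q):
    """Count every length-q substring of the text derived by an SLP.

    Assumed (hypothetical) input format: rules[i] is either a single
    character (terminal rule) or a pair (l, r) of earlier rule indices,
    meaning X_i -> X_l X_r. The last rule is the start symbol, and the
    list is assumed to be non-empty.
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    n, k = len(rules), q - 1

    # pre[i] / suf[i]: first / last min(k, length) characters of X_i's expansion.
    pre = [""] * n
    suf = [""] * n
    for i, rule in enumerate(rules):
        if isinstance(rule, str):
            pre[i] = suf[i] = rule[:k]
        else:
            l, r = rule
            pre[i] = (pre[l] + pre[r])[:k]
            joined = suf[l] + suf[r]
            suf[i] = joined[max(0, len(joined) - k):]

    # occ[i]: number of times X_i appears in the derivation tree of the text.
    occ = [0] * n
    occ[-1] = 1
    for i in range(n - 1, -1, -1):
        if not isinstance(rules[i], str):
            l, r = rules[i]
            occ[l] += occ[i]
            occ[r] += occ[i]

    counts = Counter()
    for i, rule in enumerate(rules):
        if occ[i] == 0:
            continue  # rule is never used by the start symbol
        if isinstance(rule, str):
            if q == 1:
                counts[rule] += occ[i]  # 1-grams live at the leaves
            continue
        l, r = rule
        # Every q-gram crossing the boundary between X_l and X_r lies here.
        window = suf[l] + pre[r]
        for j in range(len(window) - q + 1):
            counts[window[j:j + q]] += occ[i]
    return dict(counts)


# The SLP below derives "abababab" with only five rules.
rules = ["a", "b", (0, 1), (2, 2), (3, 3)]
print(qgram_frequencies(rules, 2))  # {'ab': 4, 'ba': 3}
```

The work per rule depends on q rather than on the length of the uncompressed text, which is why processing the compressed form can pay off when the text is highly repetitive.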

Report

(4 results)
  • 2012 Annual Research Report   Final Research Report ( PDF )
  • 2011 Annual Research Report
  • 2010 Annual Research Report
  • Research Products

    (38 results)

Journal Article (19 results) (of which Peer Reviewed: 7 results), Presentation (19 results)

  • [Journal Article] Simpler and Faster Lempel Ziv Factorization (2013)

    • Author(s)
      Keisuke Goto and Hideo Bannai
    • Journal Title

      Data Compression Conference 2013 (DCC 2013)

      Pages: 133-142

    • Related Report
      2012 Final Research Report
  • [Journal Article] From Run Length Encoding to LZ78 and Back Again (2013)

    • Author(s)
      Yuya Tamakoshi, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda
    • Journal Title

      Proc. Data Compression Conference 2013 (DCC 2013)

      Pages: 143-152

    • Related Report
      2012 Annual Research Report 2012 Final Research Report
  • [Journal Article] Computing convolution on grammar-compressed text (2013)

    • Author(s)
      Toshiya Tanaka, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda
    • Journal Title

      Proc. Data Compression Conference 2013 (DCC 2013)

      Pages: 451-460

    • Related Report
      2012 Annual Research Report 2012 Final Research Report
  • [Journal Article] Fast q-gram mining on SLP compressed strings (2013)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda
    • Journal Title

      Journal of Discrete Algorithms

      Volume: 18 Pages: 89-99

    • NAID

      120006654954

    • Related Report
      2012 Annual Research Report 2012 Final Research Report
  • [Journal Article] Simpler and Faster Lempel Ziv Factorization (2013)

    • Author(s)
      Keisuke Goto and Hideo Bannai
    • Journal Title

      Proc. Data Compression Conference 2013 (DCC 2013)

      Volume: DCC 2013 Pages: 133-142

    • DOI

      10.1109/dcc.2013.21

    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Efficient LZ78 factorization of grammar compressed text (2012)

    • Author(s)
      Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda
    • Journal Title

      19th International Symposium on String Processing and Information Retrieval (SPIRE 2012)

      Volume: 7608 Pages: 86-98

    • Related Report
      2012 Final Research Report
  • [Journal Article] An Efficient Algorithm to Test Square-Freeness of Strings Compressed by Straight-Line Programs (2012)

    • Author(s)
      Hideo Bannai, Travis Gagie, Tomohiro I, Shunsuke Inenaga, Gad M. Landau, and Moshe Lewenstein
    • Journal Title

      Information Processing Letters

      Volume: 112(19) Pages: 711-714

    • Related Report
      2012 Annual Research Report 2012 Final Research Report
  • [Journal Article] Speeding up q-gram mining on grammar-based compressed texts (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proc. 23rd Annual Symposium on Combinatorial Pattern Matching (CPM 2012)

      Volume: 7354 Pages: 220-231

    • Related Report
      2012 Final Research Report
  • [Journal Article] Computing q-gram Non-overlapping Frequencies on SLP Compressed Texts (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proc. 38th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2012)

      Volume: 7147 Pages: 301-312

    • Related Report
      2012 Final Research Report
  • [Journal Article] Finding Characteristic Substrings from Compressed Texts (2012)

    • Author(s)
      Shunsuke Inenaga, Hideo Bannai
    • Journal Title

      International Journal of Foundations of Computer Science

      Volume: 23(2) Pages: 261-280

    • Related Report
      2012 Final Research Report 2011 Annual Research Report
  • [Journal Article] Efficient LZ78 Factorization of Grammar Compressed Text (2012)

    • Author(s)
      Hideo Bannai
    • Journal Title

      SPIRE 2012

      Volume: - Pages: 86-98

    • DOI

      10.1007/978-3-642-34109-0_10

    • ISBN
      9783642341083, 9783642341090
    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Computing q-gram Non-overlapping Frequencies on SLP Compressed Texts (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proceedings of the 38th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2012)

      Volume: LNCS 7147 Pages: 301-312

    • DOI

      10.1007/978-3-642-27660-6_25

    • ISBN
      9783642276590, 9783642276606
    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speeding up q-gram mining on grammar-based compressed texts (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proceedings of the 23rd Annual Symposium on Combinatorial Pattern Matching (CPM 2012)

      Volume: (accepted for publication)

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Fast q-gram Mining on SLP Compressed Strings (2011)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proc. 18th International Symposium on String Processing and Information Retrieval (SPIRE 2011)

      Volume: 7024 Pages: 278-289

    • NAID

      120006654954

    • Related Report
      2012 Final Research Report
  • [Journal Article] Faster Subsequence and Don't-Care Pattern Matching on Compressed Texts (2011)

    • Author(s)
      Takanori Yamamoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proc. 22nd Annual Symposium on Combinatorial Pattern Matching (CPM 2011)

      Volume: 6661 Pages: 309-322

    • NAID

      120006654962

    • Related Report
      2012 Final Research Report
  • [Journal Article] Fast q-gram Mining on SLP Compressed Strings (2011)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proceedings of the 18th International Symposium on String Processing and Information Retrieval (SPIRE 2011)

      Volume: LNCS 7024 Pages: 278-289

    • DOI

      10.1007/978-3-642-24583-1_27

    • NAID

      120006654954

    • ISBN
      9783642245824, 9783642245831
    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Faster Subsequence and Don't-Care Pattern Matching on Compressed Texts (2011)

    • Author(s)
      Takanori Yamamoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Journal Title

      Proceedings of the 22nd Annual Symposium on Combinatorial Pattern Matching (CPM 2011)

      Volume: (accepted for publication)

    • NAID

      120006654962

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Sparse Substring Pattern Set Discovery using Linear Programming Boosting (2010)

    • Author(s)
      Kazuaki Kashihara, Kohei Hatano, Hideo Bannai, Masayuki Takeda
    • Journal Title

      Proc. 13th International Conference on Discovery Science (DS 2010)

      Volume: 6332 Pages: 132-143

    • Related Report
      2012 Final Research Report
  • [Journal Article] Sparse Substring Pattern Set Discovery using Linear Programming Boosting (2010)

    • Author(s)
      Kazuaki Kashihara, Kohei Hatano, Hideo Bannai, Masayuki Takeda
    • Journal Title

      Proceedings of the 13th International Conference on Discovery Science (DS 2010)

      Volume: LNAI 6332 Pages: 132-143

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Presentation] Simpler and Faster Lempel Ziv Factorization (2013)

    • Author(s)
      Keisuke Goto and Hideo Bannai
    • Organizer
      Data Compression Conference 2013 (DCC 2013)
    • Place of Presentation
      Snowbird, USA.
    • Related Report
      2012 Final Research Report
  • [Presentation] From Run Length Encoding to LZ78 and Back Again (2013)

    • Author(s)
      Yuya Tamakoshi, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda
    • Organizer
      Data Compression Conference 2013 (DCC 2013)
    • Place of Presentation
      Snowbird, USA.
    • Related Report
      2012 Final Research Report
  • [Presentation] Computing convolution on grammar-compressed text (2013)

    • Author(s)
      Toshiya Tanaka, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda
    • Organizer
      Data Compression Conference 2013 (DCC 2013)
    • Place of Presentation
      Snowbird, USA.
    • Related Report
      2012 Final Research Report
  • [Presentation] Computing convolution on grammar-compressed text (2013)

    • Author(s)
      Toshiya Tanaka, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda
    • Organizer
      Data Compression Conference 2013 (DCC 2013)
    • Place of Presentation
      Snowbird, Utah, USA
    • Related Report
      2012 Annual Research Report
  • [Presentation] From Run Length Encoding to LZ78 and Back Again (2013)

    • Author(s)
      Yuya Tamakoshi, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda
    • Organizer
      Data Compression Conference 2013 (DCC 2013)
    • Place of Presentation
      Snowbird, Utah, USA
    • Related Report
      2012 Annual Research Report
  • [Presentation] Simpler and Faster Lempel Ziv Factorization (2013)

    • Author(s)
      Keisuke Goto and Hideo Bannai
    • Organizer
      Data Compression Conference 2013 (DCC 2013)
    • Place of Presentation
      Snowbird, Utah, USA
    • Related Report
      2012 Annual Research Report
  • [Presentation] Improved q-gram Mining on SLP Compressed Strings (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      London Stringology Days/London Algorithmic Workshop 2012 (LSD & LAW 2012)
    • Place of Presentation
      King's College London, London, United Kingdom
    • Year and Date
      2012-02-09
    • Related Report
      2011 Annual Research Report
  • [Presentation] Efficient LZ78 factorization of grammar compressed text (2012)

    • Author(s)
      Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda
    • Organizer
      19th International Symposium on String Processing and Information Retrieval (SPIRE 2012)
    • Place of Presentation
      Cartagena, Colombia
    • Related Report
      2012 Final Research Report
  • [Presentation] Speeding up q-gram mining on grammar-based compressed texts (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      23rd Annual Symposium on Combinatorial Pattern Matching (CPM 2012)
    • Place of Presentation
      Helsinki, Finland
    • Related Report
      2012 Final Research Report
  • [Presentation] Computing q-gram Non-overlapping Frequencies on SLP Compressed Texts (2012)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      38th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2012)
    • Place of Presentation
      Špindlerův Mlýn, Czech Republic.
    • Related Report
      2012 Final Research Report
  • [Presentation] Efficient LZ78 factorization of grammar compressed text (2012)

    • Author(s)
      Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda
    • Organizer
      19th International Symposium on String Processing and Information Retrieval (SPIRE 2012)
    • Place of Presentation
      Cartagena, Colombia
    • Related Report
      2012 Annual Research Report
  • [Presentation] Fast q-gram Mining on SLP Compressed Strings (2011)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      Second Workshop on Algorithms for Large-Scale Information Processing in Knowledge Discovery (ALSIP 2011)
    • Place of Presentation
      Sunport Hall Takamatsu (Takamatsu City)
    • Year and Date
      2011-12-01
    • Related Report
      2011 Annual Research Report
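  • [Presentation] Efficient Computation of Non-overlapping n-gram Frequencies on Compressed Texts and Its Applications (2011)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      134th Algorithms Research Meeting
    • Place of Presentation
      University of the Ryukyus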
    • Year and Date
      2011-03-07
    • Related Report
      2010 Annual Research Report
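  • [Presentation] The VLDC Pattern Matching Problem on Compressed Texts (2011)

    • Author(s)
      Takanori Yamamoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      134th Algorithms Research Meeting
    • Place of Presentation
      University of the Ryukyus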
    • Year and Date
      2011-03-07
    • Related Report
      2010 Annual Research Report
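  • [Presentation] A Fast Method for Computing n-gram Frequencies on Compressed Strings (2011)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      Winter LA Symposium 2010
    • Place of Presentation
      Kyoto University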
    • Year and Date
      2011-02-02
    • Related Report
      2010 Annual Research Report
  • [Presentation] Fast Episode Pattern Matching on Compressed Texts (2011)

    • Author(s)
      Takanori Yamamoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      Winter LA Symposium 2010
    • Place of Presentation
      Kyoto University
    • Year and Date
      2011-02-02
    • Related Report
      2010 Annual Research Report
  • [Presentation] Fast q-gram Mining on SLP Compressed Strings (2011)

    • Author(s)
      Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      18th International Symposium on String Processing and Information Retrieval (SPIRE 2011)
    • Place of Presentation
      Pisa, Italy.
    • Related Report
      2012 Final Research Report
  • [Presentation] Faster Subsequence and Don't-Care Pattern Matching on Compressed Texts (2011)

    • Author(s)
      Takanori Yamamoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
    • Organizer
      22nd Annual Symposium on Combinatorial Pattern Matching (CPM 2011)
    • Place of Presentation
      Palermo, Italy.
    • Related Report
      2012 Final Research Report
  • [Presentation] Sparse Substring Pattern Set Discovery using Linear Programming Boosting (2010)

    • Author(s)
      Kazuaki Kashihara, Kohei Hatano, Hideo Bannai, Masayuki Takeda
    • Organizer
      13th International Conference on Discovery Science (DS 2010)
    • Place of Presentation
      Canberra, Australia.
    • Related Report
      2012 Final Research Report

Published: 2010-08-23   Modified: 2019-07-29  
