2020 Fiscal Year Annual Research Report

Development of Large-Scale Data Management and Processing Techniques Based on String Compression and Combinatorics

Research Project

Project/Area Number 18F18120
Research Institution Tokyo Medical and Dental University
Co-Investigator (Kenkyū-buntansha) Koeppl Dominik  Tokyo Medical and Dental University, M&D Data Science Center, Assistant Professor (50897395)
Project Period (FY) 2018-10-12 – 2021-03-31
Keywords data structures / algorithms / lossless compression / hashing / string data processing / tries
Outline of Annual Research Achievements

This research focused on (a) practical and dynamic trie data structures, (b) computing the Re-Pair grammar compression in small space, and (c) advances for the bijective Burrows-Wheeler transform (BBWT), a variant of the Burrows-Wheeler transform (BWT) well received in theory as well as in practice for indexing string data.
(a) We devised a novel approach to compact hashing, a technique that is the most memory-efficient option in practice for storing a huge number of integer keys from a bounded domain. Based on this approach, we proposed dynamic trie data structures that work with path decomposition or with trie compaction.
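To illustrate the general idea behind compact hashing, the following is a minimal sketch of quotienting with buckets, not the data structure of the cited work. It assumes integer keys below 2^key_bits, a bijective multiplicative hash, and plain Python lists as buckets; the class name `QuotientedSet`, its parameters, and the multiplier constant are hypothetical choices made for this example. Because the hash is a bijection, the bucket index determines part of the hashed key, so each bucket only needs to store the remaining bits (the quotient).

```python
class QuotientedSet:
    """Illustrative set of integers from [0, 2**key_bits) using quotienting."""

    def __init__(self, key_bits=32, bucket_bits=8, multiplier=0x9E3779B1):
        # An odd multiplier is invertible modulo a power of two,
        # so key -> key * multiplier mod 2**key_bits is a bijection.
        assert multiplier % 2 == 1
        assert 0 < bucket_bits < key_bits
        self.key_bits = key_bits
        self.quot_bits = key_bits - bucket_bits
        self.mask = (1 << key_bits) - 1
        self.mult = multiplier
        self.inv = pow(multiplier, -1, 1 << key_bits)  # Python 3.8+
        self.buckets = [[] for _ in range(1 << bucket_bits)]

    def _split(self, key):
        assert 0 <= key <= self.mask
        h = (key * self.mult) & self.mask
        # High bits select the bucket, low bits are the stored quotient.
        return h >> self.quot_bits, h & ((1 << self.quot_bits) - 1)

    def add(self, key):
        bucket, quotient = self._split(key)
        if quotient not in self.buckets[bucket]:
            self.buckets[bucket].append(quotient)

    def __contains__(self, key):
        bucket, quotient = self._split(key)
        return quotient in self.buckets[bucket]

    def keys(self):
        # Recover each key from its bucket index and quotient via the inverse hash.
        for bucket, quotients in enumerate(self.buckets):
            for quotient in quotients:
                h = (bucket << self.quot_bits) | quotient
                yield (h * self.inv) & self.mask
```

In this sketch each stored entry needs only key_bits minus bucket_bits bits of payload, which is where the space saving of quotienting comes from; a practical implementation would pack the quotients into bit vectors instead of Python lists.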
(b) Re-Pair, a grammar compression method with high compression ratios, is difficult to compute within a limited amount of memory. We found a quadratic-time algorithm that computes Re-Pair with almost no additional space. We also devised an index data structure built upon a grammar representing the Lyndon tree. This index exploits several properties of Lyndon words to improve the running time of the currently fastest grammar index from a quadratic factor in the pattern length to a linear one.
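For readers unfamiliar with Re-Pair, the following naive sketch shows what the grammar looks like: the most frequent pair of adjacent symbols is repeatedly replaced by a fresh nonterminal until no pair occurs twice. This is not the small-space algorithm of the report; it uses linear extra space and rescans the sequence in every round. The function names and the tuple encoding of nonterminals are assumptions made for this example.

```python
def count_bigrams(seq):
    """Count non-overlapping occurrences of each adjacent pair, scanning left to right."""
    counts = {}
    last_counted = {}
    for i in range(len(seq) - 1):
        bigram = (seq[i], seq[i + 1])
        # In a run like "aaa" the pair (a, a) at i overlaps the one at i - 1.
        if last_counted.get(bigram, -2) == i - 1:
            continue
        counts[bigram] = counts.get(bigram, 0) + 1
        last_counted[bigram] = i
    return counts


def repair(text):
    """Return (start sequence, rules) of a Re-Pair style grammar for text."""
    seq = list(text)          # terminals are single-character strings
    rules = {}                # nonterminal -> (left symbol, right symbol)
    while True:
        counts = count_bigrams(seq)
        if not counts:
            break
        bigram, freq = max(counts.items(), key=lambda item: item[1])
        if freq < 2:
            break
        nonterminal = ("N", len(rules))   # tuples cannot clash with terminals
        rules[nonterminal] = bigram
        replaced, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == bigram:
                replaced.append(nonterminal)
                i += 2
            else:
                replaced.append(seq[i])
                i += 1
        seq = replaced
    return seq, rules


def expand(symbol, rules):
    """Decompress a single grammar symbol back into its string."""
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

Calling `repair("abababab")` yields a start sequence of two symbols and two rules, and joining `expand(x, rules)` over the start sequence restores the input.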
(c) Finally, we built an indexing data structure on top of the BBWT, showed how to compute the BBWT in place or to transform the BWT into the BBWT, and showed how to build the BBWT in linear time.
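As background, the BBWT can be defined through the Lyndon factorization of the text: all rotations of all Lyndon factors are sorted by comparing their infinite repetitions, and the last character of each rotation is output. The sketch below follows this definition directly and is therefore quadratic; it is neither the in-place nor the linear-time construction mentioned above. The function names are chosen for this example only.

```python
def lyndon_factors(s):
    """Lyndon factorization of s with Duval's algorithm."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def bbwt(s):
    """Naive bijective BWT: sort the rotations of all Lyndon factors."""
    n = len(s)
    rotations = []
    for factor in lyndon_factors(s):
        for r in range(len(factor)):
            rotations.append(factor[r:] + factor[:r])

    def omega_key(rotation):
        # Two infinite repetitions u^w and v^w are ordered by their first
        # len(u) + len(v) <= 2n characters, so a prefix of length 2n suffices.
        repeats = -(-2 * n // len(rotation))   # ceiling division
        return (rotation * repeats)[:2 * n]

    rotations.sort(key=omega_key)
    return "".join(rotation[-1] for rotation in rotations)
```

For the input `banana`, the Lyndon factors are `b`, `an`, `an`, `a`, and the sketch returns `annbaa`. Unlike the BWT, no end-of-text sentinel is needed, and the output has the same length as the input.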
Aside from that, we found space-efficient algorithms for the non-overlapping LZ77 factorization and the LZ78 substring compression problem. These algorithms run in near-linear time and use space asymptotically proportional to the size of the input text in bits.
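The two factorizations named here can be stated compactly in code. The following sketch only defines them by brute force and is not the space-efficient, suffix-tree-based approach of the report: the non-overlapping LZ77 variant takes as each factor the longest prefix of the remaining text that occurs entirely within the already processed part (or a single fresh character), and LZ78 takes the shortest prefix that is not yet a previous factor. Function names are assumptions for this example.

```python
def lz77_non_overlapping(text):
    """Greedy LZ77 factorization whose references may not overlap the factor itself."""
    factors = []
    i = 0
    while i < len(text):
        length = 0
        # Extend while the candidate still occurs completely inside text[:i].
        while i + length < len(text) and text[i:i + length + 1] in text[:i]:
            length += 1
        length = max(length, 1)   # a fresh character forms a factor of length 1
        factors.append(text[i:i + length])
        i += length
    return factors


def lz78(text):
    """LZ78 factorization using a trie stored as a dictionary."""
    trie = {}        # (parent node id, character) -> child node id; root is 0
    factors = []
    node, start = 0, 0
    for pos, ch in enumerate(text):
        child = trie.get((node, ch))
        if child is None:
            # The current factor is a previous factor plus one new character.
            trie[(node, ch)] = len(trie) + 1
            factors.append(text[start:pos + 1])
            node, start = 0, pos + 1
        else:
            node = child
    if start < len(text):
        factors.append(text[start:])   # the last factor may repeat an earlier one
    return factors
```

On the text `aaaa`, the first function produces the factors `a`, `a`, `aa`, while the second produces `a`, `aa`, `a`.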

Research Progress Status

Not entered because FY2020 (Reiwa 2) is the final fiscal year.

Strategy for Future Research Activity

Not entered because FY2020 (Reiwa 2) is the final fiscal year.

  • Research Products

    (25 results)

All Int'l Joint Research (5 results) Journal Article (14 results) (of which Int'l Joint Research: 14 results,  Peer Reviewed: 14 results,  Open Access: 6 results) Presentation (5 results) (of which Int'l Joint Research: 5 results) Remarks (1 results)

  • [Int'l Joint Research] TU Dortmund/German Aerospace Center/Universitaet Stuttgart (Germany)

    • Country Name
      GERMANY
    • Counterpart Institution
      TU Dortmund/German Aerospace Center/Universitaet Stuttgart
    • # of Other Institutions
      1
  • [Int'l Joint Research] Dalhousie University (Canada)

    • Country Name
      CANADA
    • Counterpart Institution
      Dalhousie University
  • [Int'l Joint Research] University of Leicester/Aberystwyth University (United Kingdom)

    • Country Name
      UNITED KINGDOM
    • Counterpart Institution
      University of Leicester/Aberystwyth University
  • [Int'l Joint Research] University of Helsinki (Finland)

    • Country Name
      FINLAND
    • Counterpart Institution
      University of Helsinki
  • [Int'l Joint Research] University of Chile (Chile)

    • Country Name
      CHILE
    • Counterpart Institution
      University of Chile
  • [Journal Article] Re-Pair in Small Space (2021)

    • Author(s)
      Dominik Koeppl and Tomohiro I and Isamu Furuya and Yoshimasa Takabatake and Kensuke Sakai and Keisuke Goto
    • Journal Title

      Algorithms

      Volume: 14(1) Pages: 1--20

    • DOI

      10.3390/a14010005

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] PHONI: Streamed Matching Statistics with Multi-Genome References (2021)

    • Author(s)
      Christina Boucher and Travis Gagie and Tomohiro I and Dominik Koeppl and Ben Langmead and Giovanni Manzini and Gonzalo Navarro and Alejandro Pacheco and Massimiliano Rossi
    • Journal Title

      Proc. DCC

      Volume: - Pages: 193--202

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Non-Overlapping LZ77 Factorization and LZ78 Substring Compression Queries with Suffix Trees (2021)

    • Author(s)
      Dominik Koeppl
    • Journal Title

      Algorithms

      Volume: 14(2) Pages: 1--21

    • DOI

      10.3390/a14020044

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Fast and Simple Compact Hashing via Bucketing (2020)

    • Author(s)
      Dominik Koeppl and Simon J. Puglisi and Rajeev Raman
    • Journal Title

      Proc. SEA in LIPIcs

      Volume: 160 Pages: 7:1--7:14

    • DOI

      10.4230/LIPIcs.SEA.2020.7

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Re-Pair in Small Space (2020)

    • Author(s)
      Dominik Koeppl and Tomohiro I and Isamu Furuya and Yoshimasa Takabatake and Kensuke Sakai and Keisuke Goto
    • Journal Title

      Proc. PSC

      Volume: - Pages: 134--147

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Re-Pair in Small Space (Poster) (2020)

    • Author(s)
      Dominik Koeppl and Tomohiro I and Isamu Furuya and Yoshimasa Takabatake and Kensuke Sakai and Keisuke Goto
    • Journal Title

      Proc. DCC

      Volume: - Pages: 377

    • DOI

      10.1109/DCC47342.2020.00092

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches (2020)

    • Author(s)
      Kazuya Tsuruta and Dominik Koeppl and Shunsuke Kanda and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda
    • Journal Title

      Proc. DCC

      Volume: - Pages: 243--252

    • DOI

      10.1109/DCC47342.2020.00032

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Computational Aspects of Ordered Integer Partition with Bounds (2020)

    • Author(s)
      Roland Glueck and Dominik Koeppl
    • Journal Title

      Algorithmica

      Volume: 82 Pages: 2955--2984

    • DOI

      10.1007/s00453-020-00713-7

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] In-Place Bijective Burrows-Wheeler Transforms (2020)

    • Author(s)
      Dominik Koeppl and Daiki Hashimoto and Diptarama Hendrian and Ayumi Shinohara
    • Journal Title

      Proc. CPM in LIPIcs

      Volume: 161 Pages: 21:1--21:15

    • DOI

      10.4230/LIPIcs.CPM.2020.21

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Deterministic Sparse Suffix Sorting in the Restore Model (2020)

    • Author(s)
      Johannes Fischer and Tomohiro I and Dominik Koeppl
    • Journal Title

      ACM Trans. Algorithms

      Volume: 16 Pages: 50:1--50:53

    • DOI

      10.1145/3398681

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Grammar-compressed Self-index with Lyndon Words (2020)

    • Author(s)
      Kazuya Tsuruta and Dominik Koeppl and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda
    • Journal Title

      IPSJ TOM

      Volume: 13 Pages: 84--92

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Dynamic Path-Decomposed Tries (2020)

    • Author(s)
      Shunsuke Kanda and Dominik Koeppl and Yasuo Tabei and Kazuhiro Morita and Masao Fuketa
    • Journal Title

      ACM JEA

      Volume: 25 Pages: 1.13:2--1.13:28

    • DOI

      10.1145/3418033

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Space-efficient algorithms for computing minimal/shortest unique substrings (2020)

    • Author(s)
      Takuya Mieno and Dominik Koeppl and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda
    • Journal Title

      Theor. Comput. Sci.

      Volume: 845 Pages: 230--242

    • DOI

      10.1016/j.tcs.2020.09.017

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] On Arithmetically Progressed Suffix Arrays (2020)

    • Author(s)
      Jacqueline W. Daykin and Dominik Koeppl and David Kuebel and Florian Stober
    • Journal Title

      Proc. PSC

      Volume: - Pages: 96--110

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] PHONI: Streamed Matching Statistics with Multi-Genome References (2021)

    • Author(s)
      Dominik Koeppl
    • Organizer
      DCC
    • Int'l Joint Research
  • [Presentation] Fast and Simple Compact Hashing via Bucketing (2020)

    • Author(s)
      Dominik Koeppl
    • Organizer
      SEA
    • Int'l Joint Research
  • [Presentation] Re-Pair in Small Space (2020)

    • Author(s)
      Dominik Koeppl
    • Organizer
      PSC
    • Int'l Joint Research
  • [Presentation] c-Trie++: A Dynamic Trie Tailored for Fast Prefix Searches (2020)

    • Author(s)
      Kazuya Tsuruta and Dominik Koeppl
    • Organizer
      DCC
    • Int'l Joint Research
  • [Presentation] In-Place Bijective Burrows-Wheeler Transforms (2020)

    • Author(s)
      Dominik Koeppl
    • Organizer
      CPM
    • Int'l Joint Research
  • [Remarks] personal homepage

    • URL

      https://dkppl.de/

Published: 2021-12-27  
