
2023 Fiscal Year Research-status Report

Indexing Massive Datasets with Algorithmic Engineered Compression Techniques on Modern Computer Architectures

Research Project

Project/Area Number 21K17701
Research Institution University of Yamanashi

Principal Investigator

Koeppl Dominik  University of Yamanashi, Graduate Faculty of Interdisciplinary Research, Specially Appointed Associate Professor (50897395)

Project Period (FY) 2021-04-01 – 2025-03-31
Keywords compressed indexes / string subsequences / NP-hard problems / straight line programs / collage systems / block trees / parameterized BWT / pattern matching
Outline of Annual Research Achievements

Following the research plan outlined for fiscal year 2023, our primary focus was on extending string regularities from substrings to subsequences, exploring NP-hard problems associated with strings, and refining compressed indexing data structures.
In the first thematic area, we computed the longest Lyndon subsequence within space and time bounds better than those presented at IWOCA 2022. We also showed how to compute the longest bordered and the longest periodic subsequences. The key ingredient is a set of novel tools for computing the longest common subsequences between all prefixes and all suffixes of a text. In addition, for the longest bordered subsequence, we established a conditional lower bound that matches our quadratic running time.
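To make the connection between bordered subsequences and prefix/suffix LCS values concrete, here is a minimal sketch. It is a simple cubic-time baseline, not the quadratic algorithm of the report. It assumes the usual definition of a border (a nonempty proper prefix that is also a suffix) and uses the fact that every bordered string can be written as u v u with u nonempty. The function names are hypothetical.

```python
def lcs_all_prefixes(a, b):
    """Return row with row[i] = LCS length of a[:i] and the whole of b."""
    prev = [0] * (len(a) + 1)          # LCS(a[:i], "") = 0
    for ch in b:                        # extend b by one character at a time
        cur = [0] * (len(a) + 1)
        for i in range(1, len(a) + 1):
            if a[i - 1] == ch:
                cur[i] = prev[i - 1] + 1
            else:
                cur[i] = max(cur[i - 1], prev[i])
        prev = cur
    return prev


def longest_bordered_subsequence_length(text):
    """Length of a longest subsequence of text of the form u v u, u nonempty.

    Baseline idea: pick u as a common subsequence of text[:p] and text[q:]
    (p <= q) and keep the whole of text[p:q] as the middle part v.
    Runs in O(n^3) time; returns 0 if no bordered subsequence exists.
    """
    n = len(text)
    best = 0
    for q in range(1, n):               # second copy of u comes from text[q:]
        row = lcs_all_prefixes(text[:q], text[q:])
        for p in range(1, q + 1):       # first copy of u comes from text[:p]
            if row[p] >= 1:
                best = max(best, 2 * row[p] + (q - p))
    return best


print(longest_bordered_subsequence_length("abxxaxb"))
```

For "abxxaxb" the sketch reports 6, witnessed by the subsequence "abxxab" with border "ab". The sketch recomputes an LCS table for every split position, which is exactly the redundancy that a method handling all prefix/suffix pairs at once avoids.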
Next, we studied common NP-hard problems that take strings as input, using answer set programming solvers. We also proved that finding the smallest run-length compressed straight-line program (RLSLP) is NP-hard for unbounded alphabet sizes, and we adapted this proof to the problem of finding the smallest collage system. Finally, we devised a MAX-SAT encoding for computing the smallest RLSLP.
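For readers unfamiliar with RLSLPs, the following sketch shows what such a grammar looks like and how it derives its string. The tuple-based rule encoding and the function name are assumptions made for this illustration only. The sketch merely decodes a given grammar; it does not search for a smallest one, which is the NP-hard task.

```python
# Hypothetical rule encoding used only in this sketch:
#   ("char", c)      X -> c      a single terminal symbol
#   ("pair", Y, Z)   X -> Y Z    concatenation of two variables
#   ("run", Y, k)    X -> Y^k    k repetitions of Y (the run-length rule)

def expand(grammar, start):
    """Return the unique string derived from variable `start`."""
    cache = {}

    def derive(var):
        if var not in cache:
            rule = grammar[var]
            if rule[0] == "char":
                cache[var] = rule[1]
            elif rule[0] == "pair":
                cache[var] = derive(rule[1]) + derive(rule[2])
            elif rule[0] == "run":
                cache[var] = derive(rule[1]) * rule[2]
            else:
                raise ValueError(f"unknown rule type: {rule[0]!r}")
        return cache[var]

    return derive(start)


# Six rules deriving "ababababc"; the run rule replaces a chain of
# concatenations that a plain straight-line program would need.
example = {
    "A": ("char", "a"),
    "B": ("char", "b"),
    "C": ("pair", "A", "B"),
    "D": ("run", "C", 4),
    "E": ("char", "c"),
    "S": ("pair", "D", "E"),
}
print(expand(example, "S"))
```

The recursive expansion is adequate for small examples; a deep grammar would call for an iterative traversal instead.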
In the final thematic area, we advanced construction algorithms, in practice for block trees and in theory for the parameterized Burrows-Wheeler transform.
For the latter, we also demonstrated that this transform can be adapted for circular pattern matching by changing the encoding.
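As background on the role an encoding plays in parameterized matching, the sketch below implements the classic prev-encoding: each parameter symbol is replaced by the distance to its previous occurrence (0 for a first occurrence), while static symbols are kept. Two parameterized strings of equal length match exactly when their prev-encodings are equal. This is only the standard encoding, shown under the assumption that it is a helpful reference point; the modified encoding used for circular pattern matching is not reproduced here. Function names and the example alphabet are hypothetical.

```python
def prev_encode(text, static_alphabet):
    """Prev-encode `text`; symbols outside `static_alphabet` are parameters."""
    last_seen = {}                      # parameter symbol -> last position
    encoded = []
    for i, ch in enumerate(text):
        if ch in static_alphabet:
            encoded.append(ch)          # static symbols are kept verbatim
        else:
            # distance to previous occurrence, or 0 for the first occurrence
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
    return encoded


def p_match(x, y, static_alphabet):
    """True if x and y are equal up to a bijective renaming of parameters."""
    return len(x) == len(y) and (
        prev_encode(x, static_alphabet) == prev_encode(y, static_alphabet)
    )


STATIC = set("AB")                      # assumed static alphabet for the demo
print(prev_encode("xyAxy", STATIC))
print(p_match("xyAxy", "yxAyx", STATIC))
print(p_match("xyAxy", "xxAxy", STATIC))
```

Distances are stored as integers and static symbols as characters, so the two kinds of entries cannot be confused in the encoded list.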

Current Status of Research Progress

2: Research has progressed on the whole more than it was originally planned.

Reason

We conducted the research for fiscal year 2023 as planned and completed most of the planned research by the end of the grant term in fiscal year 2023.

Strategy for Future Research Activity

As the grant's term ended in fiscal year 2023, we are now preparing to apply for a new grant for fiscal year 2025. This research has opened new directions within string regularities and compressed indexes, which we intend to pursue in the coming years.
While our main focus has been on text indexing data structures for classic pattern matching, extended pattern matching queries remain largely unexplored. We therefore aim to build on several concepts discovered during our recent research and combine them with state-of-the-art indexing techniques for classic pattern matching. We expect the resulting indexes to find practical use where conventional pattern matching is too restrictive and more flexible matching criteria are needed.

  • Research Products

    (16 results)


Int'l Joint Research (3 results) / Journal Article (7 results; of which Int'l Joint Research: 7, Peer Reviewed: 7, Open Access: 4) / Presentation (5 results; of which Int'l Joint Research: 1) / Remarks (1 result)

  • [Int'l Joint Research] MPI Saarbruecken/Karlsruhe Institute of Technology/University of Muenster (Germany)

    • Country Name
      GERMANY
    • Counterpart Institution
      MPI Saarbruecken/Karlsruhe Institute of Technology/University of Muenster
  • [Int'l Joint Research] University of Helsinki (Finland)

    • Country Name
      FINLAND
    • Counterpart Institution
      University of Helsinki
  • [Int'l Joint Research] Nicolaus Copernicus University in Torun (Poland)

    • Country Name
      POLAND
    • Counterpart Institution
      Nicolaus Copernicus University in Torun
  • [Journal Article] Computing Longest Lyndon Subsequences and Longest Common Lyndon Subsequences 2024

    • Author(s)
      Hideo Bannai and Tomohiro I and Tomasz Kociumaka and Dominik Koeppl and Simon J. Puglisi
    • Journal Title

      Algorithmica

      Volume: 86 Pages: 735-756

    • DOI

      10.1007/s00453-023-01125-z

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Extending the Parameterized Burrows-Wheeler Transform 2024

    • Author(s)
      Eric M. Osterkamp and Dominik Koeppl
    • Journal Title

      Proceedings of DCC

      Volume: - Pages: 143-152

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] On the Hardness of Smallest RLSLPs and Collage Systems 2024

    • Author(s)
      Akiyoshi Kawamoto and Tomohiro I and Dominik Koeppl and Hideo Bannai
    • Journal Title

      Proceedings of DCC

      Volume: - Pages: 243-252

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Constructing and Indexing the Bijective and Extended Burrows-Wheeler Transform 2024

    • Author(s)
      Hideo Bannai and Juha Kaerkkaeinen and Dominik Koeppl and Marcin Piatkowski
    • Journal Title

      Inf. Comput.

      Volume: 297 Pages: 1-30

    • DOI

      10.1016/j.ic.2024.105153

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Encoding Hard String Problems with Answer Set Programming 2023

    • Author(s)
      Dominik Koeppl
    • Journal Title

      Proceedings of CPM

      Volume: 259 Pages: 17:1-17:21

    • DOI

      10.4230/LIPIcs.CPM.2023.17

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Longest bordered and periodic subsequences 2023

    • Author(s)
      Hideo Bannai and Tomohiro I and Dominik Koeppl
    • Journal Title

      Inf. Process. Lett.

      Volume: 182 Pages: 1-6

    • DOI

      10.1016/j.ipl.2023.106398

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Faster Block Tree Construction 2023

    • Author(s)
      Dominik Koeppl and Florian Kurpicz and Daniel Meyer
    • Journal Title

      Proceedings of ESA

      Volume: 274 Pages: 74:1-74:20

    • DOI

      10.4230/LIPIcs.ESA.2023.74

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Computing Compression Measures Using Answer Set Programming 2024

    • Author(s)
      Dominik Koeppl and Mutsunori Banbara
    • Organizer
      Local Proceedings of the LA Symposium Winter 2023
  • [Presentation] Extending the Parameterized Burrows-Wheeler Transform 2023

    • Author(s)
      Eric M. Osterkamp and Dominik Koeppl
    • Organizer
      Local Proceedings of the Computation Workshop (コンピュテーション研究会)
  • [Presentation] Compression Sensitivity of lex-parse 2023

    • Author(s)
      Yuto Nakashima and Dominik Koeppl and Mitsuru Funakoshi and Shunsuke Inenaga
    • Organizer
      Local Proceedings of the 195th Algorithms Workshop (アルゴリズム研究会)
  • [Presentation] Encoding Hard String Problems with Answer Set Programming 2023

    • Author(s)
      Dominik Koeppl
    • Organizer
      Sequences in London
    • Int'l Joint Research
  • [Presentation] Enumerating Smallest String Attractors Using ZDDs 2023

    • Author(s)
      Yuta Fujioka and Toshiki Saitoh and Dominik Koeppl
    • Organizer
      Operations Research Society of Japan, Kyushu Chapter: OR Young Researchers' Exchange Meeting in the Kyushu Area
  • [Remarks] personal homepage

    • URL

      https://dkppl.de/


Published: 2024-12-25  
