Deepening BWT for massive data processing

Research Project

Project/Area Number 19K20213
Research Category

Grant-in-Aid for Early-Career Scientists

Allocation Type Multi-year Fund
Review Section Basic Section 60010: Theory of informatics-related
Research Institution Kyushu Institute of Technology

Principal Investigator

I Tomohiro  Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Associate Professor (20773360)

Project Period (FY) 2019-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2021: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2020: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2019: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
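Keywords String processing / BW transform / Compressed string processing / Compressed index / Generalized pattern matching / Compression transform / Data compression / Compressed information processing / Grammar compression / Burrows-Wheeler transform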
Outline of Research at the Start

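The Burrows-Wheeler transform (BWT) was proposed in 1994 as a data transformation method for compression. It was later found to offer various advantages for data processing, and many important discoveries have continued to be made in recent years. This research positions the BWT as a technique for converting data into a representation suited to processing and, by thoroughly pursuing the ideas underlying it, develops foundational technology for large-scale data analysis.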

Outline of Final Research Achievements

The Burrows-Wheeler Transform (BWT) of a string is obtained by sorting the characters of the string according to the suffixes that follow them; it has been used for data compression and compressed data processing. In this project we obtained the following results: (1) We simplified the index based on the Run-length BWT (RLBWT) and improved the throughput of its direct construction. (2) We proposed a practical algorithm for converting RLBWT to LZ77. (3) We proposed a BWT-based index for palindrome pattern matching. (4) We proposed an efficient algorithm to construct BWT-based indexes for parameterized pattern matching.
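As an illustration of the definition above, the BWT can be sketched by sorting all suffixes and emitting the character that precedes each one. This is a minimal textbook sketch, not one of the project's algorithms: the function name `bwt`, the `$` end marker, and the quadratic sorting approach are assumptions made for the example.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort the suffixes of text+sentinel and emit the
    character that precedes each suffix (illustration only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Suffix array by direct comparison of suffixes; practical tools use
    # dedicated suffix-sorting algorithms instead of this approach.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # For i == 0, s[i - 1] wraps around to the sentinel at the end of s.
    return "".join(s[i - 1] for i in suffix_array)


print(bwt("banana"))  # prints "annb$aa"
```

The sentinel is assumed to compare smaller than every other character, which holds for `$` against lowercase ASCII letters.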

Academic Significance and Societal Importance of the Research Achievements

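In data processing, how data is represented is a fundamental and critically important question that strongly affects processing efficiency. The Burrows-Wheeler transform (BWT), proposed as a data transformation method for compression, has been shown by subsequent research to offer various advantages for data processing. This project contributed to the development of algorithms that operate on the run-length compressed BWT string and of techniques that apply the BWT to generalized pattern matching.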
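To make "run-length compressed BWT string" concrete, the sketch below run-length encodes a BWT string into (character, run length) pairs. It is only a hedged illustration of the representation; the function names are hypothetical, and the project's algorithms work on such runs without expanding them, which this example does not attempt.

```python
from itertools import groupby


def run_length_encode(bwt_string: str) -> list[tuple[str, int]]:
    """Collapse each maximal run of equal characters into (char, length)."""
    return [(ch, sum(1 for _ in group)) for ch, group in groupby(bwt_string)]


def run_length_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (char, length) pairs back into the original string."""
    return "".join(ch * length for ch, length in runs)


# "annb$aa" is the BWT of "banana$"; it consists of 5 runs.
runs = run_length_encode("annb$aa")
print(runs)  # [('a', 1), ('n', 2), ('b', 1), ('$', 1), ('a', 2)]
```

On highly repetitive inputs the number of runs is typically much smaller than the text length, which is why this representation is attractive for massive data.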

Report

(6 results)
  • 2023 Annual Research Report   Final Research Report ( PDF )
  • 2022 Research-status Report
  • 2021 Research-status Report
  • 2020 Research-status Report
  • 2019 Research-status Report

Research Products

(26 results)

Journal Article (16 results) (of which Int'l Joint Research: 9 results, Peer Reviewed: 16 results, Open Access: 5 results) Presentation (10 results) (of which Int'l Joint Research: 10 results)

  • [Journal Article] Computing Longest Lyndon Subsequences and Longest Common Lyndon Subsequences (2024)

    • Author(s)
      Hideo Bannai, Tomohiro I, Tomasz Kociumaka, Dominik Koeppl, Simon J. Puglisi
    • Journal Title

      Algorithmica

      Volume: 86 Issue: 3 Pages: 735-756

    • DOI

      10.1007/s00453-023-01125-z

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] On the Hardness of Smallest RLSLPs and Collage Systems (2024)

    • Author(s)
      Akiyoshi Kawamoto, Tomohiro I, Dominik Koeppl, Hideo Bannai
    • Journal Title

      Proc. Data Compression Conference (DCC) 2024

      Volume: - Pages: 243-252

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching (2024)

    • Author(s)
      Kento Iseri, Tomohiro I, Diptarama Hendrian, Dominik Koeppl, Ryo Yoshinaka, Ayumi Shinohara
    • Journal Title

      Proc. 51st International Colloquium on Automata, Languages, and Programming (ICALP) 2024

      Volume: -

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Longest bordered and periodic subsequences (2023)

    • Author(s)
      Hideo Bannai, Tomohiro I, Dominik Koeppl
    • Journal Title

      Inf. Process. Lett.

      Volume: 182 Pages: 1-6

    • DOI

      10.1016/j.ipl.2023.106398

    • Related Report
      2023 Annual Research Report 2022 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] PalFM-index: FM-index for Palindrome Pattern Matching (2023)

    • Author(s)
      Shinya Nagashita, Tomohiro I
    • Journal Title

      Proc. 34th Annual Symposium on Combinatorial Pattern Matching (CPM) 2023

      Volume: -

    • Related Report
      2023 Annual Research Report 2022 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Space-Efficient B Trees via Load-Balancing (2022)

    • Author(s)
      Tomohiro I, Dominik Koeppl
    • Journal Title

      Proc. 33rd International Workshop on Combinatorial Algorithms (IWOCA) 2022

      Volume: - Pages: 327-340

    • DOI

      10.1007/978-3-031-06678-8_24

    • ISBN
      9783031066771, 9783031066788
    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Computing Longest (Common) Lyndon Subsequences (2022)

    • Author(s)
      Hideo Bannai, Tomohiro I, Tomasz Kociumaka, Dominik Koeppl, Simon J. Puglisi
    • Journal Title

      Proc. 33rd International Workshop on Combinatorial Algorithms (IWOCA) 2022

      Volume: - Pages: 128-142

    • DOI

      10.1007/978-3-031-06678-8_10

    • ISBN
      9783031066771, 9783031066788
    • Related Report
      2022 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Converting RLBWT to LZ77 in smaller space (2022)

    • Author(s)
      Masaki Shigekuni, Tomohiro I
    • Journal Title

      IEEE Computer Society Press CPS Online

      Volume: - Pages: 242-251

    • Related Report
      2021 Research-status Report
    • Peer Reviewed
  • [Journal Article] PHONI: Streamed Matching Statistics with Multi-genome References (2021)

    • Author(s)
      Christina Boucher, Travis Gagie, Tomohiro I, Dominik Koeppl, Ben Langmead, Giovanni Manzini, Gonzalo Navarro, Alejandro Pacheco, Massimiliano Rossi
    • Journal Title

      Proc. Data Compression Conference (DCC) 2021

      Volume: - Pages: 193-202

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Deterministic Sparse Suffix Sorting in the Restore Model (2020)

    • Author(s)
      Johannes Fischer, Tomohiro I, Dominik Koeppl
    • Journal Title

      ACM Transactions on Algorithms

      Volume: 16 Issue: 4 Pages: 1-53

    • DOI

      10.1145/3398681

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Re-Pair in Small Space (2020)

    • Author(s)
      Dominik Koeppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto
    • Journal Title

      Algorithms

      Volume: 14 Issue: 1 Pages: 1-20

    • DOI

      10.3390/a14010005

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Practical Random Access to SLP-Compressed Texts (2020)

    • Author(s)
      Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Louisa Seelbach Benkner, Yoshimasa Takabatake
    • Journal Title

      Proc. 27th International Symposium on String Processing and Information Retrieval (SPIRE) 2020

      Volume: - Pages: 221-231

    • DOI

      10.1007/978-3-030-59212-7_16

    • ISBN
      9783030592110, 9783030592127
    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Dynamic index and LZ factorization in compressed space (2020)

    • Author(s)
      Takaaki Nishimoto, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda
    • Journal Title

      Discrete Applied Mathematics

      Volume: 274 Pages: 116-129

    • DOI

      10.1016/j.dam.2019.01.014

    • Related Report
      2019 Research-status Report
    • Peer Reviewed
  • [Journal Article] Faster Privacy-Preserving Computation of Edit Distance with Moves (2020)

    • Author(s)
      Yohei Yoshimoto, Masaharu Kataoka, Yoshimasa Takabatake, Tomohiro I, Kilho Shin, Hiroshi Sakamoto
    • Journal Title

      Proc. International Workshop on Algorithms and Computation (WALCOM) 2020

      Volume: - Pages: 308-320

    • DOI

      10.1007/978-3-030-39881-1_26

    • ISBN
      9783030398804, 9783030398811
    • Related Report
      2019 Research-status Report
    • Peer Reviewed
  • [Journal Article] Rpair: Rescaling RePair with Rsync (2019)

    • Author(s)
      Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Yoshimasa Takabatake
    • Journal Title

      Proc. 26th International Symposium on String Processing and Information Retrieval (SPIRE) 2019

      Volume: - Pages: 35-44

    • DOI

      10.1007/978-3-030-32686-9_3

    • ISBN
      9783030326852, 9783030326869
    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] k-Abelian Pattern Matching: Revisited, Corrected, and Extended (2019)

    • Author(s)
      Golnaz Badkobeh, Hideo Bannai, Maxime Crochemore, Tomohiro I, Shunsuke Inenaga, Shiho Sugimoto
    • Journal Title

      Proc. Prague Stringology Conference 2019

      Volume: - Pages: 29-40

    • Related Report
      2019 Research-status Report
    • Peer Reviewed
  • [Presentation] On the Hardness of Smallest RLSLPs and Collage Systems (2024)

    • Author(s)
      Akiyoshi Kawamoto, Tomohiro I, Dominik Koeppl, Hideo Bannai
    • Organizer
      Data Compression Conference (DCC) 2024
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] PalFM-index: FM-index for Palindrome Pattern Matching (2023)

    • Author(s)
      Shinya Nagashita, Tomohiro I
    • Organizer
      34th Annual Symposium on Combinatorial Pattern Matching (CPM) 2023
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] PalFM-index: FM-index for Palindrome Pattern Matching (2023)

    • Author(s)
      Shinya Nagashita, Tomohiro I
    • Organizer
      34th Annual Symposium on Combinatorial Pattern Matching (CPM) 2023
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] Space-Efficient B Trees via Load-Balancing (2022)

    • Author(s)
      Tomohiro I, Dominik Koeppl
    • Organizer
      33rd International Workshop on Combinatorial Algorithms (IWOCA) 2022
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] Computing Longest (Common) Lyndon Subsequences (2022)

    • Author(s)
      Hideo Bannai, Tomohiro I, Tomasz Kociumaka, Dominik Koeppl, Simon J. Puglisi
    • Organizer
      33rd International Workshop on Combinatorial Algorithms (IWOCA) 2022
    • Related Report
      2022 Research-status Report
    • Int'l Joint Research
  • [Presentation] Converting RLBWT to LZ77 in smaller space (2022)

    • Author(s)
      Masaki Shigekuni, Tomohiro I
    • Organizer
      Data Compression Conference 2022
    • Related Report
      2021 Research-status Report
    • Int'l Joint Research
  • [Presentation] PHONI: Streamed Matching Statistics with Multi-genome References (2021)

    • Author(s)
      Christina Boucher, Travis Gagie, Tomohiro I, Dominik Koeppl, Ben Langmead, Giovanni Manzini, Gonzalo Navarro, Alejandro Pacheco, Massimiliano Rossi
    • Organizer
      Data Compression Conference (DCC) 2021
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Re-Pair in Small Space (2020)

    • Author(s)
      Dominik Koeppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto
    • Organizer
      Prague Stringology Conference (PSC) 2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Practical Random Access to SLP-Compressed Texts (2020)

    • Author(s)
      Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Louisa Seelbach Benkner, Yoshimasa Takabatake
    • Organizer
      27th International Symposium on String Processing and Information Retrieval (SPIRE) 2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Rpair: Rescaling RePair with Rsync (2019)

    • Author(s)
      Tomohiro I
    • Organizer
      SPIRE
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research

Published: 2019-04-18   Modified: 2025-01-30  