Development of Large-Scale Data Management and Processing Techniques Based on String Compression and Combinatorics
Project/Area Number |
18F18120
|
Research Category |
Grant-in-Aid for JSPS Fellows
|
Allocation Type | Single-year Grants |
Section | Foreign |
Research Field |
Theory of informatics
|
Research Institution | Tokyo Medical and Dental University (2020) / Kyushu University (2018-2019) |
Principal Investigator |
Shunsuke Inenaga (2018-2019) Kyushu University, Faculty of Information Science and Electrical Engineering, Associate Professor (60448404)
|
Co-Investigator(Kenkyū-buntansha) |
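Koeppl Dominik Tokyo Medical and Dental University, M&D Data Science Center, Assistant Professor (50897395)
KOEPPL DOMINIK Kyushu University, Faculty of Information Science and Electrical Engineering, JSPS International Research Fellow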
|
Project Period (FY) |
2018-10-12 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥1,400,000 (Direct Cost: ¥1,400,000)
Fiscal Year 2020: ¥400,000 (Direct Cost: ¥400,000)
Fiscal Year 2019: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 2018: ¥400,000 (Direct Cost: ¥400,000)
|
Keywords | data structures / algorithms / lossless compression / hashing / string data processing / tries / text indexing |
Outline of Annual Research Achievements |
The focus of this research was on (a) practical and dynamic trie data structures, (b) the computation of the grammar compression Re-Pair in small space, and (c) advancements for the bijective Burrows-Wheeler transform (BBWT), a variant of the Burrows-Wheeler transform (BWT) well received in theory as well as in practice for indexing string data. (a) We devised a novel approach to compact hashing, which is the most memory-efficient approach in practice when working with a huge number of integer keys from a bounded domain. Based on this approach, we proposed dynamic trie data structures working with path decomposition or with trie compaction. (b) Re-Pair, a grammar compressor achieving high compression ratios, is difficult to compute within a limited amount of memory. Here, we found a quadratic-time algorithm that computes Re-Pair with almost no additional space. We also devised an index data structure built upon a grammar representing the Lyndon tree. This index exploits several properties of Lyndon words to improve the running time of the currently fastest grammar index from a quadratic factor in the pattern length to a linear one. (c) Finally, we were able to build an indexing data structure on top of the BBWT, compute the BBWT in place or transform the BWT into the BBWT, and build the BBWT in linear time. Aside from that, we found space-efficient factorization algorithms for the non-overlapping LZ77 factorization and the LZ78 substring compression problem. These algorithms work in near-linear time with space asymptotic to the input text length in bits.
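As an illustration of item (c), the following is a minimal sketch of the BBWT computed directly from its textbook definition: factorize the text into Lyndon words (here with Duval's algorithm), collect all cyclic rotations of all factors, sort them by comparing their infinite repetitions, and output the last character of each rotation. This is a naive, superlinear construction and is not the linear-time or in-place algorithm developed in the project; the function names `lyndon_factors`, `omega_cmp`, and `bbwt` are hypothetical and chosen only for this example.

```python
from functools import cmp_to_key


def lyndon_factors(s):
    """Lyndon factorization of s via Duval's algorithm (illustrative helper)."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend the current run of repeated Lyndon-word candidates.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every full repetition of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def omega_cmp(u, v):
    """Compare the infinite repetitions u^omega and v^omega.

    Comparing prefixes of length len(u) + len(v) suffices: if the two
    infinite words differ, they differ within that prefix.
    """
    n = len(u) + len(v)
    a = (u * (n // len(u) + 1))[:n]
    b = (v * (n // len(v) + 1))[:n]
    return (a > b) - (a < b)


def bbwt(s):
    """Naive BBWT: last characters of the sorted rotations of all Lyndon factors."""
    rotations = []
    for w in lyndon_factors(s):
        for r in range(len(w)):
            rotations.append(w[r:] + w[:r])
    rotations.sort(key=cmp_to_key(omega_cmp))
    return "".join(rot[-1] for rot in rotations)


if __name__ == "__main__":
    print(lyndon_factors("banana"))
    print(bbwt("banana"))
```

Unlike the classical BWT, this transform needs no end-of-text sentinel, because the Lyndon factorization alone determines how the text is recovered. The output is always a permutation of the input characters.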
|
Research Progress Status |
Not entered, because fiscal year 2020 (Reiwa 2) is the final year.
|
Strategy for Future Research Activity |
Not entered, because fiscal year 2020 (Reiwa 2) is the final year.
|
Report
(3 results)
Research Products
(43 results)
-
[Journal Article] Re-Pair in Small Space (2020)
Author(s)
Dominik Koeppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto
Journal Title
Algorithms
Volume: 14
Issue: 1
Pages: 1-20
DOI
Related Report
Peer Reviewed / Open Access / Int'l Joint Research
-
[Journal Article] Indexing the Bijective BWT (2019)
Author(s)
Hideo Bannai, Juha Karkkainen, Dominik Koeppl, Marcin Piatkowski
Journal Title
Proceedings of the 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Volume: LIPIcs 128
DOI
Related Report
Peer Reviewed / Open Access / Int'l Joint Research
-
[Journal Article] Compact data structure for shortest unique substring queries (2019)
Author(s)
Takuya Mieno, Dominik Koeppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda
Journal Title
Proceedings of 26th International Symposium on String Processing and Information Retrieval, Lecture Notes in Computer Science
Volume: 11811
Pages: 107-123
DOI
ISBN
9783030326852, 9783030326869
Related Report
Peer Reviewed
-
[Journal Article] Indexing the Bijective BWT (2019)
Author(s)
Hideo Bannai, Juha Karkkainen, Dominik Koeppl, Marcin Piatkowski
Journal Title
Proc. 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)
Volume: to appear
Related Report
Peer Reviewed / Open Access / Int'l Joint Research