String Indexing Based on Space-Optimal Grammar Compression and Its Application to Knowledge Discovery from Stream Data
Project/Area Number | 18K18111 |
Research Category | Grant-in-Aid for Early-Career Scientists |
Allocation Type | Multi-year Fund |
Review Section | Basic Section 61030: Intelligent informatics-related |
Research Institution | Kyushu Institute of Technology |
Principal Investigator |
|
Project Period (FY) | 2018-04-01 – 2021-03-31 |
Project Status | Completed (Fiscal Year 2020) |
Budget Amount | ¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2020: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2019: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2018: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000) |
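Keywords | Data compression / Compressed index / Compressed information processing / Grammar compression / BWT / String search / Random access / Compressed search / Secure computation / Edit distance with moves / Text data compression / Online algorithm |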
Outline of Final Research Achievements |
Collections of highly repetitive text now exceed a terabyte in size and continue to grow. In this research, we developed grammar compression methods and Online Run-Length BWTs (ORLBWTs) that compress such large streaming data at high speed while working in compressed space. We also developed several techniques for processing information directly on the compressed data. Although we could not develop a grammar-based compressed index supporting real-time keyword search on large streaming data, we significantly improved the construction time of ORLBWTs, and this work led to an ORLBWT-based compressed index that supports real-time search on large streaming data [Bannai et al. TCS2020].
|
Academic Significance and Societal Importance of the Research Achievements |
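The grammar compression methods and the Online Run-Length BWT (ORLBWT) developed in this project make it possible to compress data exceeding a terabyte faster and with less memory than before. In addition, a compressed index that applies the ORLBWT and supports real-time keyword search makes it possible to extract information efficiently from massive streaming data. Applying the various compressed information processing techniques developed here is also expected to enable real-time knowledge discovery from massive streaming data.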
|
Report
(4 results)
Research Products
(16 results)
[Journal Article] Re-Pair in Small Space (2020)
Author(s)
Dominik Koeppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto
Journal Title
Algorithms
Volume: 14
Issue: 1
Pages: 1-20
DOI
Related Report
Peer Reviewed / Open Access / Int'l Joint Research
-
[Presentation] Practical Random Access to SLP-Compressed Texts (2020)
Author(s)
Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Louisa Seelbach Benkner, Yoshimasa Takabatake
Organizer
The 27th International Symposium on String Processing and Information Retrieval (SPIRE)
Related Report
Int'l Joint Research
-
[Presentation] Re-Pair in Small Space (2020)
Author(s)
Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto
Organizer
Prague Stringology Conference (PSC) 2020
Related Report
Int'l Joint Research
-
[Presentation] Re-Pair in Small Space (2020)
Author(s)
Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto
Organizer
Data Compression Conference
Related Report
Int'l Joint Research
-
[Presentation] Rpair: Rescaling RePair with Rsync (2019)
Author(s)
Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Yoshimasa Takabatake
Organizer
The 26th International Symposium on String Processing and Information Retrieval
Related Report
Int'l Joint Research