Outline of Research Achievements |
One of the major steps towards practically improved data structures was an in-depth analysis of hash tables. Here, we worked with Shunsuke Kanda and Katsuya Tsuruta on different trie data structures that employ hash tables to speed up queries or to reduce space usage. On a more general topic, I (Koeppl) devised, together with Rajeev Raman and Simon Puglisi, two compact hash tables that are optimized for fast construction while using less memory than any other known hash table. These hash tables help to improve associative containers in situations where the insertion of big data is the most vital operation. The work with Shunsuke Kanda et al. has been submitted to a journal, the work with Katsuya Tsuruta et al. was accepted at DCC'2020, and the work with Rajeev Raman and Simon Puglisi was accepted at SEA'2020.
I (Koeppl) set another research focus on the bijective Burrows-Wheeler transform (BBWT) [Gil and Scott, arXiv 2012]. Here, we devised a self-index on the BBWT, resulting in a conference paper at CPM'2019. Next, we found a connection between the BBWT and suffix sorting, resulting in a linear-time construction algorithm. We published this result on arXiv, and plan to submit it combined with practical evaluations. To further understand the relation between the BBWT and the BWT, we studied conversions between these two transformations together with researchers from Prof. Ayumi Shinohara's laboratory at Tohoku University, and the findings of this study were accepted at CPM'2020.
|
Current Status of Research Progress (Category) |
2: Research has progressed rather smoothly
Reason
It is hard to judge whether the current status is delayed or on schedule. The most recent results have been accepted at conferences (twice at DCC 2020, once at CPM 2020, and once at SEA 2020), but the proceedings are not yet available. I do not think that any of the journal articles I submitted with my colleagues during the JSPS program will be published before the scholarship ends, as the journal publication process in theoretical computer science, especially in renowned journals like Algorithmica or TCS, unfortunately takes a very long time. The current results also spark new research questions, which I probably cannot completely answer during the JSPS program. Overall, I am satisfied with the current research status, and I am confident that the achievements of the two-year program will be considered worthwhile.
|
Strategy for Future Research Activity |
For the next six months, I have two projects in mind. The first is to analyze different tools to speed up and slim down the Lempel-Ziv 78 factorization, for which we have already developed the main tools such as a compact hash table (i.e., the SEA'2020 publication). The plan is to develop this into an exhaustive study suitable for submission to a journal. The second is to find new possibilities for indexing integer and real matrices within compressed space. The aim is to augment a grammar computed on the matrix with an indexing data structure for accelerating common matrix operations such as multiplication. There are currently no sophisticated approaches that sufficiently exploit two-dimensional data by means of a grammar. The first objective would be to propose an approach that exploits the shape of the two-dimensional data in such a way that the grammar is much smaller than a string grammar built on the serialization of the matrix. The second objective would be to propose an indexing data structure for common matrix operations that needs less space than the plain matrix while performing the operations faster. Another line of research on this topic is to study ways of computing already proposed grammars in less time, ideally in optimal time in the word-packing model.
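As a point of reference for the first project, a textbook LZ78 factorization can be sketched with a trie whose edges are stored in a hash table. The sketch below is only a minimal illustration and not the implementation behind the SEA'2020 work: it uses Python's built-in dict in place of a compact hash table, and the function names lz78_factorize and lz78_decode are hypothetical.

```python
def lz78_factorize(text):
    """Return the LZ78 factors of text as (referred_factor, new_char) pairs.

    Factor index 0 denotes the empty string. The LZ78 trie is stored
    implicitly: the dict maps (parent_node, char) to the child node id.
    Assumption: a plain dict stands in for a compact hash table.
    """
    trie = {}       # (parent node id, character) -> child node id
    factors = []
    node = 0        # current trie node; 0 is the root
    next_id = 1     # id given to the next created node (= factor index)
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            node = child                  # keep extending the current match
        else:
            trie[(node, ch)] = next_id    # new trie node = new factor
            next_id += 1
            factors.append((node, ch))
            node = 0                      # restart matching at the root
    if node != 0:
        # The text ended inside an existing factor: emit it without a new char.
        factors.append((node, None))
    return factors


def lz78_decode(factors):
    """Rebuild the text from the factor list (used here as a sanity check)."""
    phrases = [""]                        # phrases[0] is the empty string
    for ref, ch in factors:
        phrases.append(phrases[ref] + (ch or ""))
    return "".join(phrases)


if __name__ == "__main__":
    fs = lz78_factorize("abaabab")
    print(fs)
    print(lz78_decode(fs))
```

In this sketch every trie edge costs one dict entry keyed by a pair, which is the part a compact hash table would aim to store in less space.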
|