Publicly Offered Research
Grant-in-Aid for Transformative Research Areas (A)
Recent advances in technology have made it possible to collect vast amounts of biological data that are valuable for studying genetic diseases and devising individually targeted therapies. Unfortunately, while the collection of such data has gathered momentum, we are unaware of solutions that can process the collected data efficiently while supporting biologically important queries and respecting privacy. Such a solution could make it possible to gain insights into diseases and into side effects of medical treatments caused by genetic variations.
To index biological data in a meaningful way, we presented two new approaches at SPIRE'22. The first is an augmentation of the r-index that improves the time for random access to the suffix array. Such access is usually implemented by applying the Phi array sequentially, each application of which requires a predecessor query; this method has proven slow in practice. We slightly improved the time by simulating the predecessor queries with a walk on a labelled graph, which lets us omit some of the predecessor queries. The second approach concerns parameterized pattern matching, an extension of classic pattern matching. Here, we proposed the first efficient algorithm for computing the parameterized Burrows-Wheeler transform online.

For computing matching statistics, we improved the practical running time by augmenting the r-index with helper data structures, namely a grammar supporting longest common extension (LCE) queries and the thresholds array. While Bannai et al. [TCS'20] showed how to compute matching statistics with the r-index, we provided two successive improvements: first with a software called PHONI two years ago, and then with a recent practical improvement that skips some LCE queries by storing additional LCE values at the thresholds. The small increase in space is justified by a remarkable improvement in query time, since the LCE queries answered by the grammar tend to be the bottleneck of the whole algorithm.
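The baseline random access to the suffix array mentioned above (repeated application of Phi, each step answered by a predecessor query) can be sketched as follows. This is only a toy illustration of the standard technique, not the graph-walk method presented at SPIRE'22. It assumes the text ends with a unique sentinel `$` that is smaller than all other characters, builds the suffix array naively, and stores one sample per BWT run start; all function names are hypothetical.

```python
from bisect import bisect_right


def suffix_array(text):
    # Naive construction by sorting suffixes; adequate for a toy example only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_phi_samples(text, sa):
    """For every BWT run start p, store the pair (SA[p], SA[p-1]).

    Returns two parallel lists sorted by the key SA[p].
    Assumption: text ends with a unique sentinel '$'.
    """
    n = len(text)
    # BWT[p] is the character preceding suffix SA[p]; text[-1] is the sentinel.
    bwt = [text[i - 1] for i in sa]
    samples = []
    for p in range(n):
        if p == 0 or bwt[p] != bwt[p - 1]:
            # For p == 0, sa[p - 1] wraps around to SA[n-1].
            samples.append((sa[p], sa[p - 1]))
    samples.sort()
    keys = [k for k, _ in samples]
    values = [v for _, v in samples]
    return keys, values


def phi(i, keys, values):
    """Phi(i) = SA[ISA[i] - 1], computed with one predecessor query."""
    j = bisect_right(keys, i) - 1  # predecessor of i among sampled positions
    return values[j] + (i - keys[j])


def sa_access(k, last_sa_value, n, keys, values):
    """Return SA[k] by applying Phi (n - 1 - k) times, starting from SA[n-1]."""
    pos = last_sa_value
    for _ in range(n - 1 - k):
        pos = phi(pos, keys, values)
    return pos


if __name__ == "__main__":
    text = "GATTACAGATTACA$"
    sa = suffix_array(text)
    keys, values = build_phi_samples(text, sa)
    n = len(text)
    recovered = [sa_access(k, sa[-1], n, keys, values) for k in range(n)]
    print(recovered == sa)
```

Each call to `sa_access` performs up to n - 1 predecessor queries, which illustrates why the sequential application of Phi can be slow and why skipping some of these queries is attractive.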
Not entered because FY2022 (Reiwa 4) is the final fiscal year.
Int'l Joint Research (8 results) Journal Article (22 results) (of which Int'l Joint Research: 22 results, Peer Reviewed: 22 results, Open Access: 12 results) Presentation (7 results) (of which Int'l Joint Research: 1 result) Remarks (2 results)
Analytics
Volume: 2 Issue: 1 Pages: 146-162
10.3390/analytics2010009
Information Processing Letters
Volume: 179 Pages: 1-8
10.1016/j.ipl.2022.106274
SN Computer Science
Volume: 3 Issue: 3 Pages: 1-8
10.1007/s42979-022-01084-2
Proc. 33rd International Workshop on Combinatorial Algorithms (IWOCA) 2022
Volume: - Pages: 128-142
10.1007/978-3-031-06678-8_10
Volume: - Pages: 327-340
10.1007/978-3-031-06678-8_24
Algorithms
Volume: 15 Issue: 5 Pages: 1-15
10.3390/a15050163
Algorithmica
Volume: 84 Issue: 9 Pages: 2735-2766
10.1007/s00453-022-00996-y
Proceedings of SPIRE
Volume: 13617 Pages: 70-85
10.1007/978-3-031-20643-6_6
Volume: 13617 Pages: 86-98
10.1007/978-3-031-20643-6_7
Proceedings of ESA
Volume: 244
Proc. VLDB
Volume: 15 Issue: 10 Pages: 2175-2187
10.14778/3547305.3547321
Proc. DCC
Volume: 2022 Pages: 63-72
10.1109/dcc52660.2022.00014
Volume: 2022 Pages: 83-92
10.1109/dcc52660.2022.00016
Volume: 2022 Pages: 232-241
10.1109/dcc52660.2022.00031
Information
Volume: 13 Issue: 4 Pages: 168-168
10.3390/info13040168
Information and Computation
Volume: - Pages: 104794-104794
10.1016/j.ic.2021.104794
Algorithms
Volume: 14 Issue: 6 Pages: 161-161
10.3390/a14060161
Proceedings of CPM
Volume: 191
Proc. SPIRE
Volume: 12944 Pages: 143-150
10.1007/978-3-030-86692-1_12
Proceedings of the 28th International Symposium on String Processing and Information Retrieval (SPIRE 2021)
Volume: 12944 Pages: 85-99
10.1007/978-3-030-86692-1_8
Proceedings of the 28th International Symposium on String Processing and Information Retrieval (SPIRE 2021)
Volume: 12944 Pages: 167-178
10.1007/978-3-030-86692-1_14
ACM JEA
Volume: 26 Pages: 1-47
10.1145/3481638
https://dkppl.de/