2021 Fiscal Year Annual Research Report
Efficient Query Processing for Learning-based Data Management
Project/Area Number |
19K11979
|
Research Institution | Osaka University |
Principal Investigator |
肖 川 Osaka University, Graduate School of Information Science and Technology, Associate Professor (10643900)
|
Project Period (FY) |
2019-04-01 – 2022-03-31
|
Keywords | query processing / ML + DB / high-dimensional data / similarity search |
Outline of Annual Research Achievements |
There were three major achievements in FY2021, the final year of this research project. First, we studied efficient query processing for embeddings, a fundamental operation in data science and machine learning tasks. Using hierarchical graph structures, we proposed a novel indexing approach for approximate nearest neighbor search over real-valued high-dimensional data. In our experimental evaluation, under the same query processing time constraint, the proposed approach improved recall by 3% to 10% compared with existing solutions, while requiring less indexing time. We published these results in the Proceedings of the VLDB Endowment (PVLDB), 2021. Second, we finished prototyping the system and released the source code of our software on GitHub. The released software includes the programs used in our papers published at ACM SIGMOD 2021 and in PVLDB 2021. Third, we reported the findings of this project and gave a tutorial on querying high-dimensional data at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2021, a top-tier conference in the data science research community.
|