Summary of Research Achievements |
This research project produced three major achievements in FY2021, its final year. First, we studied efficient query processing for embeddings, a fundamental operation in data science and machine learning tasks. Using hierarchical graph structures, we proposed a novel indexing approach for approximate nearest neighbor search over real-valued high-dimensional data. In our experimental evaluation, under the same query processing time constraint, the proposed approach improved recall by 3% to 10% compared with existing solutions while requiring less indexing time. We published these results in the Proceedings of the VLDB Endowment (PVLDB), 2021. Second, we completed the system prototype and released the source code of our software on GitHub. The release includes the programs used in our papers published at ACM SIGMOD 2021 and in PVLDB 2021. Third, we reported the findings of this project and gave a tutorial on querying high-dimensional data at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2021, a top-tier conference in the data science research community.
|