GeoFlink: A real-time and highly scalable processing framework for the spatial data streams
Project/Area Number |
20K19806
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 60080:Database-related
|
Research Institution | National Institute of Advanced Industrial Science and Technology |
Principal Investigator |
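SHAIKH SALMAN AHMED National Institute of Advanced Industrial Science and Technology, Department of Information Technology and Human Factors, Senior Researcher (30742621)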
|
Project Period (FY) |
2020-04-01 – 2023-03-31
|
Project Status |
Completed (Fiscal Year 2022)
|
Budget Amount |
¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2021: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2020: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
|
Keywords | GeoFlink / Scalable Processing / Spatial Stream / Continuous Queries / Spatial Indexing / Range Query / Knn Query / Join Query / Spatial Geometries / Range / Knn / Join |
Outline of Research at the Start |
With the growing use of GPS-enabled devices, spatial data is omnipresent. Many applications require real-time processing of spatial data, for instance to guide people to safety in a disaster, which may involve processing billions of tuples per second. This work proposes GeoFlink, a real-time and highly scalable processing framework for spatial data streams. GeoFlink will enable efficient processing of highly dynamic spatial data streams by extending Apache Flink, a state-of-the-art big data streaming platform, as its base system.
|
Outline of Final Research Achievements |
Advances in data collection technologies have led to a steady increase in spatial data. Spatial data is voluminous and often requires real-time processing. This project focuses on the research and development of GeoFlink, a scalable, real-time spatial data stream management system. GeoFlink extends Apache Flink to support spatial data types, indexes, and continuous queries over spatial data streams. A grid-based index is introduced to enable efficient processing of continuous queries and effective data distribution across the nodes of a computing cluster. GeoFlink supports spatial range, spatial kNN, and spatial join queries on point, multi-point, line, multi-line, polygon, and multi-polygon geometry types. An extensive experimental study on real spatial data streams shows that GeoFlink achieves significantly higher query throughput than ordinary Flink processing. In this project, we published three conference papers and one journal paper. GeoFlink is open source and is registered as an AIST IP.
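As an illustration of how a grid-based index can prune work in a spatial range query, the following is a minimal Python sketch. It is not GeoFlink's API or implementation: the `UniformGrid` class, the `range_query` function, and all parameter names are hypothetical, and a uniform square grid over a planar coordinate space is assumed. Each point is mapped to a cell key, only points in cells overlapping the query circle's bounding box are kept, and the exact distance check runs on the survivors.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformGrid:
    """Hypothetical uniform grid (illustration only, not GeoFlink's API)."""
    min_x: float      # x coordinate of the grid origin
    min_y: float      # y coordinate of the grid origin
    cell_size: float  # side length of each square cell

    def cell_of(self, x, y):
        """Return the (column, row) key of the cell containing (x, y)."""
        return (math.floor((x - self.min_x) / self.cell_size),
                math.floor((y - self.min_y) / self.cell_size))

    def candidate_cells(self, qx, qy, radius):
        """Return keys of all cells overlapping the query circle's bounding box."""
        c0, r0 = self.cell_of(qx - radius, qy - radius)
        c1, r1 = self.cell_of(qx + radius, qy + radius)
        return {(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)}


def range_query(grid, points, qx, qy, radius):
    """Return the points within `radius` of (qx, qy), pruning by cell key first."""
    cells = grid.candidate_cells(qx, qy, radius)
    hits = []
    for x, y in points:
        if grid.cell_of(x, y) not in cells:
            continue  # cheap rejection: the point's cell cannot contain a match
        if math.hypot(x - qx, y - qy) <= radius:
            hits.append((x, y))  # exact distance check on remaining candidates
    return hits
```

In a distributed setting, the same cell key could also serve as a partitioning key so that nearby points are routed to the same worker, which is the data-distribution role the outline attributes to the index.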
|
Academic Significance and Societal Importance of the Research Achievements |
Our proposed framework, GeoFlink, enables low-latency processing of continuous queries (range, kNN, and join) over spatial data streams. As a real-time spatial data processing framework, GeoFlink can be used for targeted marketing, disaster management, autonomous driving, robot path guidance, and similar applications.
|
Report
(4 results)
Research Products
(4 results)