The Vantage Index: Executing Distance Queries at Scale
Giannis Evagorou, M. Lavalle, T. Heinis
32nd International Conference on Scientific and Statistical Database Management, 2020-07-07
DOI: 10.1145/3400903.3400933 (https://doi.org/10.1145/3400903.3400933)
Citation count: 0
Abstract
Due to the proliferation of GPS-enabled devices, vast amounts of trajectory data are collected every day. Analyzing this data efficiently and at scale is a major challenge. Several types of spatio-temporal queries are used to analyze these datasets. One important query is the distance query on trajectory data which, given a query distance D, a point P and a time span T, finds all trajectories that come within D of P during T. This query is frequently used in traffic analysis and numerous other applications. In this paper we develop the means to analyze large amounts of trajectory data efficiently and scalably with the distance query. To this end we develop a method for distributing the trajectory data across a distributed infrastructure (Spark), as well as the index needed on each node to answer the query locally. Our experiments show that our approach is more efficient than a baseline method.
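To make the distance query concrete, the following is a minimal brute-force sketch of its semantics, not the Vantage Index itself and not necessarily the baseline used in the paper. It assumes trajectories are lists of timestamped samples in planar coordinates, uses Euclidean distance, and tests only the recorded samples rather than interpolating between them. The names `Sample` and `distance_query` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Sample:
    """One recorded position of a moving object (assumed planar coordinates)."""
    t: float  # timestamp
    x: float
    y: float


def distance_query(trajectories, p, d, t_start, t_end):
    """Return ids of trajectories with at least one sample within
    distance d of point p during the time span [t_start, t_end].

    This is a linear scan over every sample; an index would avoid
    inspecting trajectories that cannot possibly match.
    """
    px, py = p
    hits = []
    for traj_id, samples in trajectories.items():
        if any(
            t_start <= s.t <= t_end and hypot(s.x - px, s.y - py) <= d
            for s in samples
        ):
            hits.append(traj_id)
    return hits


# Illustrative data only (not taken from the paper).
trajectories = {
    "a": [Sample(0, 0.0, 0.0), Sample(10, 3.0, 4.0)],
    "b": [Sample(5, 10.0, 10.0), Sample(15, 1.0, 1.0)],
    "c": [Sample(5, 50.0, 50.0)],
}

result = distance_query(trajectories, p=(0.0, 0.0), d=5.0, t_start=0, t_end=10)
print(result)
```

Trajectory "b" passes close to P, but only at t = 15, so it matches only when the time span T is widened to include that timestamp. The scan touches every sample of every trajectory, which is the cost that partitioning and a per-node index aim to reduce.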