Efficient and scalable DBSCAN framework for clustering continuous trajectories in road networks

IF 4.3; Tier 1, Earth Sciences; JCR Q1, Computer Science, Information Systems

International Journal of Geographical Information Science Pub Date : 2023-06-01 DOI:10.1080/13658816.2023.2217443

B. Chen, Yuhua Luo, Yu Zhang, Tao Jia, Hui-Ping Chen, Jianya Gong, Qingquan Li

Volume 37, pages 1693-1727. Publication type: Journal Article.

Citations: 0

Abstract

Clustering the trajectories of vehicles moving on road networks is a key data mining technique for understanding human mobility patterns, as well as their interactions with urban environments. The development of efficient and scalable trajectory clustering algorithms, however, still faces challenges because of the computational costs when measuring similarities among a large number of network-constrained trajectories. To address this problem, a novel trajectory clustering framework based on the well-developed Density-Based Spatial Clustering of Applications with Noise (DBSCAN) approach is proposed. This proposed framework accurately quantifies similarities using a trajectory representation of continuous polylines in the space and time dimensions, and does not require trajectory discretization. Further, the proposed framework utilizes the space-time buffering concept to formulate ε-neighborhood queries that directly retrieve the ε-neighbors of trajectories and thus avoids computing a trajectory similarity matrix. State-of-the-art trajectory databases and index structures are incorporated to further improve trajectory clustering performance. A comprehensive case study was carried out using an open dataset of 20,161 trajectories. Results show that the proposed framework efficiently executed trajectory clustering on the large test dataset within 3 min. This was approximately 2,700 times faster than existing DBSCAN algorithms.
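The central idea in the abstract, driving DBSCAN with ε-neighborhood queries instead of a precomputed similarity matrix, can be illustrated with a minimal sketch. This is not the authors' implementation. The `dbscan` function is the textbook algorithm written against a pluggable `region_query` callback, and `make_toy_query` is a hypothetical brute-force stand-in for the space-time buffer query that the paper delegates to a trajectory database and index. The toy records (one position and one time per item) and the thresholds `eps_space` and `eps_time` are assumptions made purely for illustration.

```python
NOISE = -1


def dbscan(items, region_query, min_pts):
    """Textbook DBSCAN that only ever asks 'who is within eps of x?'.

    region_query(item) must return the eps-neighbors of item, including
    item itself. No pairwise similarity matrix is built.
    """
    labels = {}
    cluster_id = 0
    for item in items:
        if item in labels:
            continue
        neighbors = region_query(item)
        if len(neighbors) < min_pts:
            labels[item] = NOISE  # may later be relabeled as a border item
            continue
        labels[item] = cluster_id
        seeds = list(neighbors)
        i = 0
        while i < len(seeds):
            q = seeds[i]
            i += 1
            if labels.get(q) == NOISE:
                labels[q] = cluster_id  # border item: join, do not expand
                continue
            if q in labels:
                continue
            labels[q] = cluster_id
            q_neighbors = region_query(q)
            if len(q_neighbors) >= min_pts:  # q is a core item: keep expanding
                seeds.extend(q_neighbors)
        cluster_id += 1
    return labels


def make_toy_query(data, eps_space, eps_time):
    """Hypothetical stand-in for a space-time buffer query.

    Two records are neighbors when they are within eps_space in position
    AND within eps_time in time. A real system would answer this from an
    indexed trajectory database rather than by scanning every record.
    """
    def query(key):
        x, t = data[key]
        return [k for k, (x2, t2) in data.items()
                if abs(x - x2) <= eps_space and abs(t - t2) <= eps_time]
    return query


# Toy data: id -> (position along a road, departure time). Illustrative only.
toy = {"a": (0.0, 0), "b": (0.5, 1), "c": (1.0, 2),
       "d": (10.0, 0), "e": (10.4, 1), "f": (50.0, 30)}
labels = dbscan(toy, make_toy_query(toy, eps_space=0.6, eps_time=1), min_pts=2)
print(labels)
```

With these toy values, `a`, `b` and `c` chain into one cluster, `d` and `e` form a second, and `f` is labeled noise. The design point is that the clustering loop issues one neighborhood query per visited item, so its cost depends on how fast each query is answered rather than on evaluating all pairs up front.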


Source journal

International Journal of Geographical Information Science (Geosciences; Computer Science: Information Systems)

CiteScore

11.00

Self-citation rate

7.00%

Articles published

Review time

9 months

About the journal: International Journal of Geographical Information Science provides a forum for the exchange of original ideas, approaches, methods and experiences in the rapidly growing field of geographical information science (GIScience). It is intended to interest those who research fundamental and computational issues of geographic information, as well as issues related to the design, implementation and use of geographical information for monitoring, prediction and decision making. Published research covers innovations in GIScience and novel applications of GIScience in natural resources, social systems and the built environment, as well as relevant developments in computer science, cartography, surveying, geography and engineering in both developed and developing countries.