Revealing representative day-types in transport networks using traffic data clustering
Journal of Intelligent Transportation Systems, Volume 28, Issue 5, Pages 695-718 (published 2023-08-04)
DOI: 10.1080/15472450.2023.2205020
URL: https://www.sciencedirect.com/org/science/article/pii/S1547245023000841
Impact factor: 2.8; JCR: Q3 (Transportation); Citations: 0
Abstract
Recognition of spatio-temporal traffic patterns at the network-wide level plays an important role in data-driven intelligent transport systems (ITS) and is a basis for applications such as short-term prediction and scenario-based traffic management. Common practice in the transport literature is to rely on well-known general unsupervised machine-learning methods (e.g., k-means, hierarchical, spectral, DBSCAN) to select the most representative structure and number of day-types based solely on internal evaluation indices. These are easy to calculate but are limited since they only use information in the clustered dataset itself. In addition, the quality of clustering should ideally be demonstrated by external validation criteria, by expert assessment or the performance in its intended application. The main contribution of this paper is to test and compare the common practice of internal validation with external validation criteria represented by the application to short-term prediction, which also serves as a proxy for more general traffic management applications. When compared to external evaluation using short-term prediction, internal evaluation methods have a tendency to underestimate the number of representative day-types needed for the application. Additionally, the paper investigates the impact of using dimensionality reduction. By using just 0.1% of the original dataset dimensions, very similar clustering and prediction performance can be achieved, with up to 20 times lower computational costs, depending on the clustering method. K-means and agglomerative clustering may be the most scalable methods, using up to 60 times fewer computational resources for very similar prediction performance to the p-median clustering.
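The contrast between internal and application-based (external) evaluation described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the data are synthetic hourly profiles for three made-up day-types, the silhouette coefficient stands in for "an internal index", and the external criterion is a deliberately simple predictor that matches the first hours of a held-out day to the nearest cluster centroid and uses that centroid as the forecast for the rest of the day. The helper names (`make_days`, `kmeans`, `silhouette`, `prediction_mae`), the peak hours, the noise level, and the farthest-first initialisation are all assumptions made for illustration, and dimensionality reduction is left out.

```python
import math
import random


def make_days(n_per_type, rng, n_slots=24):
    """Synthetic hourly flow profiles for three made-up day-types."""
    peak_hours = [(8, 17), (11, 15), (13, 13)]  # hypothetical peaks per day-type
    days = []
    for am, pm in peak_hours:
        for _ in range(n_per_type):
            days.append([
                100 * math.exp(-(t - am) ** 2 / 8)
                + 100 * math.exp(-(t - pm) ** 2 / 8)
                + rng.gauss(0, 5)  # measurement noise (assumed)
                for t in range(n_slots)
            ])
    return days


def dist(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: dist(x, centroids[j]))


def kmeans(data, k, n_iter=20):
    """Lloyd's algorithm with farthest-first initialisation (deterministic)."""
    centroids = [data[0]]
    while len(centroids) < k:
        centroids.append(max(data, key=lambda x: min(dist(x, c) for c in centroids)))
    for _ in range(n_iter):
        labels = [nearest(x, centroids) for x in data]
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(x, centroids) for x in data]


def silhouette(data, labels):
    """Mean silhouette coefficient: an internal index using only the clustered data."""
    scores = []
    for i, x in enumerate(data):
        by_cluster = {}
        for j, y in enumerate(data):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(x, y))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton or single cluster: 0 by convention
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def prediction_mae(centroids, test_days, n_observed=10):
    """External criterion: forecast the rest of a day from its best-matching centroid."""
    errors = []
    for day in test_days:
        j = min(range(len(centroids)),
                key=lambda c: dist(day[:n_observed], centroids[c][:n_observed]))
        errors += [abs(p - y) for p, y in zip(centroids[j][n_observed:], day[n_observed:])]
    return sum(errors) / len(errors)


rng = random.Random(0)
train = make_days(30, rng)  # days used for clustering
test = make_days(10, rng)   # held-out days used only for prediction

results = {}
for k in range(2, 7):
    centroids, labels = kmeans(train, k)
    results[k] = (silhouette(train, labels), prediction_mae(centroids, test))
    print(f"k={k}  silhouette={results[k][0]:.3f}  prediction MAE={results[k][1]:.2f}")
```

Comparing the two columns shows whether the number of day-types favoured by the internal index is also the one that gives the lowest prediction error. On toy data like this the two may well agree; the abstract's finding concerns real network data, where internal indices tended to suggest fewer day-types than the prediction task needed.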
About the journal:
The Journal of Intelligent Transportation Systems is devoted to scholarly research on the development, planning, management, operation and evaluation of intelligent transportation systems. Intelligent transportation systems are innovative solutions that address contemporary transportation problems. They are characterized by information, dynamic feedback and automation that allow people and goods to move efficiently. They encompass the full scope of information technologies used in transportation, including control, computation and communication, as well as the algorithms, databases, models and human interfaces. These technologies have emerged only relatively recently as a new pathway for transportation.
The Journal of Intelligent Transportation Systems is especially interested in research that leads to improved planning and operation of the transportation system through the application of new technologies. The journal is particularly interested in research that adds to the scientific understanding of the impacts that intelligent transportation systems can have on accessibility, congestion, pollution, safety, security, noise, and energy and resource consumption.
The journal is interdisciplinary and accepts work from the fields of engineering, economics, planning, policy, business and management, as well as any other disciplines that contribute to the scientific understanding of intelligent transportation systems. The journal is also multimodal and accepts work on intelligent transportation for all forms of ground, air and water transportation. Example topics include the role of information systems in transportation, traffic flow and control, vehicle control, routing and scheduling, traveler response to dynamic information, planning for ITS innovations, evaluations of ITS field operational tests, ITS deployment experiences, automated highway systems, vehicle control systems, diffusion of ITS, and tools/software for analysis of ITS.