Generalization strategies for improving bus travel time prediction across networks

IF 5, Zone 2 (Sociology), JCR Q1 (Urban Studies)

Journal of Urban Management, Volume 13, Issue 3, Pages 372-385. Pub Date: 2024-05-29. DOI: 10.1016/j.jum.2024.05.002

Zack Aemmer, Sondre Sørbø, Alfredo Clemente, Massimiliano Ruocco


Citations: 0

Abstract


Generalization strategies for improving bus travel time prediction across networks

This study develops and evaluates predictive models for bus travel times that can be adapted to any transit network, or to new roadway segments without prior travel time data. Most prior work relies on non-standardized features such as road traffic forecasts, or on closed-source datasets, to test predictions on a single route or network. We use standardized, open-source data from GTFS and GTFS-RT feeds to gather four months of realtime bus position data from the transit networks of Seattle and Trondheim. We then test and refine strategies for generalizing model predictions across both locations. To achieve this, we first develop a data pipeline to process and clean the raw data, then extract features from the standardized sources. We then evaluate the performance of several deep learning and heuristic models in predicting bus travel times between source and target bus networks. Holdout data is taken from selected routes in the source city to validate the internal generalization of the models. Data from the target city is used to evaluate the external generalization of the models. An ablation study explores the impact of different open data sources on model generalization (GPS, static timetables, OpenStreetMap and other realtime trips). We then extend the analysis to 33 international bus networks, placing the results in broader context and testing fine-tuning strategies for generalization. Results show that deep learning methods generalize well within the source network, with as little as 1% loss in MAPE on holdout routes. With minimal fine-tuning, generalization on the target network improves significantly. Model features built on static schedule data, realtime positions or OpenStreetMap embeddings improved generalization performance (up to 10% reduction in MAPE). The improvement was more pronounced for networks with a greater initial quantity of training data. As a route-planning tool for roadways without prior data, geospatial data mining can provide reasonable bus travel time estimates. For cross-sectional bus network analysis, fine-tuning on at least 100 trajectory samples for each target network is required to significantly outperform baseline heuristics. This necessitates a GTFS-RT or other standardized realtime data feed in the target city.
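The evaluation described in the abstract (holding out selected routes and scoring predictions by MAPE against a baseline heuristic) can be illustrated with a minimal sketch. This is not the authors' pipeline: the trip records, the field names (`route`, `dist_m`, `time_s`), the helper functions and the average-speed baseline are all hypothetical, chosen only to show how a route-level holdout and a MAPE score fit together.

```python
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * mean(abs(a - p) / a for a, p in zip(actual, predicted))


def split_by_route(trips, holdout_routes):
    """Hold out whole routes so the test set contains only unseen routes."""
    train = [t for t in trips if t["route"] not in holdout_routes]
    holdout = [t for t in trips if t["route"] in holdout_routes]
    return train, holdout


def fit_mean_speed(trips):
    """Hypothetical baseline heuristic: one average speed (m/s) for the network."""
    return sum(t["dist_m"] for t in trips) / sum(t["time_s"] for t in trips)


# Toy trip records (made-up values, not data from the study).
trips = [
    {"route": "A", "dist_m": 1000.0, "time_s": 200.0},
    {"route": "A", "dist_m": 2000.0, "time_s": 400.0},
    {"route": "B", "dist_m": 1500.0, "time_s": 250.0},
    {"route": "C", "dist_m": 1200.0, "time_s": 300.0},
]

train, holdout = split_by_route(trips, holdout_routes={"C"})
speed = fit_mean_speed(train)

# Predict travel time on the held-out route from distance and baseline speed.
predicted = [t["dist_m"] / speed for t in holdout]
actual = [t["time_s"] for t in holdout]
holdout_mape = mape(actual, predicted)
print(f"baseline MAPE on holdout routes: {holdout_mape:.2f}%")
```

Splitting by route rather than by individual trip keeps every trip of a held-out route out of training, which is what makes the score a measure of generalization to unseen routes. The same `mape` function could score data from a second city to mirror the external evaluation.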


Source journal

Journal of Urban Management (Urban Studies)

CiteScore: 9.50
Self-citation rate: 4.90%
Articles published:
Review time: 65 days

About the journal: Journal of Urban Management (JUM) is the official journal of Zhejiang University and the Chinese Association of Urban Management. It is an international, peer-reviewed, open access journal covering the planning, administration, regulation, and governance of urban complexity. JUM has two aims: to integrate studies across fields in urban planning and management, and to provide a more holistic perspective on problem solving. To that end, it seeks to 1) explore innovative management skills for taming the thorny problems that arise with global urbanization, and 2) provide a platform for urban affairs whose solutions must be approached from an interdisciplinary perspective.