Ensemble methods for route choice

Impact factor 7.6 · Zone 1, Engineering Technology · JCR Q1, Transportation Science & Technology

Transportation Research Part C-Emerging Technologies · Pub Date: 2024-08-24 · DOI: 10.1016/j.trc.2024.104803

Haotian Wang, Emily Moylan, David Levinson

Volume 167, Article 104803


Abstract

Understanding travellers’ route preferences makes it possible to calculate traffic flow on network segments and helps in assessing facility requirements, costs, and the impact of network modifications. Most research employs logit-based choice methods to model individuals’ route choices, but machine learning models are attracting growing interest. However, all of these methods typically rely on a single ‘best’ model for predictions, which may be sensitive to measurement errors in the training data. Moreover, predictions from discarded models might still provide insights into route choices. The ensemble approach combines outcomes from multiple models that use various pattern recognition methods, assumptions, and/or data sets to deliver improved predictions. When configured correctly, ensemble models offer greater prediction accuracy and account for uncertainties. To examine the advantages of ensemble techniques, a data set from the I-35W Bridge Collapse study in 2008 and another from the 2011 Travel Behavior Inventory (TBI), both in Minneapolis–St. Paul (the Twin Cities), are used to train a set of route choice models and combine them with ensemble techniques. The analysis considers travellers’ socio-demographics and trip attributes. The trained models are applied to two data sets, the Longitudinal Employer-Household Dynamics (LEHD) commute trips and TBI morning peak trips, for validation. Predictions are also compared with loop detector records on freeway links. Traditional Multinomial Logit and Path-Size Logit models, along with machine learning methods such as Decision Tree, Random Forest, Extra Tree, AdaBoost, Support Vector Machine, and Neural Network, serve as the base models for this study. Ensemble rules, including hard voting, soft voting, ranked choice voting, and stacking, are tested in both case studies. Based on the results, heterogeneous ensembles using soft voting outperform the base models and other ensemble rules on testing sets.
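
To make the difference between two of the ensemble rules named in the abstract concrete, the following is a minimal sketch of hard voting and soft voting for a single trip. It is not the authors' implementation: the function names, the equal default weights, and the probability values are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def hard_vote(probabilities):
    """Each model votes for its most likely route; the most common vote wins."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    # On a tie, most_common returns the vote that was seen first.
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probabilities, weights=None):
    """Average the models' choice probabilities, then pick the highest."""
    weights = weights or [1.0] * len(probabilities)
    total = sum(weights)
    n_routes = len(probabilities[0])
    averaged = [
        sum(w * p[r] for w, p in zip(weights, probabilities)) / total
        for r in range(n_routes)
    ]
    return max(range(n_routes), key=averaged.__getitem__), averaged


# Made-up choice probabilities from three base models
# for one trip with three candidate routes (indices 0, 1, 2).
model_probs = [
    [0.40, 0.35, 0.25],  # e.g. a logit model
    [0.45, 0.40, 0.15],  # e.g. a random forest
    [0.05, 0.90, 0.05],  # e.g. a neural network
]

print(hard_vote(model_probs))     # route 0: two of three models rank it first
print(soft_vote(model_probs)[0])  # route 1: highest averaged probability
```

In this toy case the two rules disagree. Hard voting discards how confident each model is, while soft voting lets one strongly confident model outweigh two weakly confident ones. The sketch only illustrates the mechanics and says nothing about which rule performs better on real data.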


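Source journal

Transportation Research Part C-Emerging Technologies (Engineering Technology: Transportation Science & Technology)

CiteScore: 15.80

Self-citation rate: 12.00%

Articles published: 332

Review time: 64 days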

Journal description: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.