Ensemble methods for route choice
Authors: Haotian Wang, Emily Moylan, David Levinson
Journal: Transportation Research Part C-Emerging Technologies, volume 167, Article 104803
Published: 2024-08-24
DOI: 10.1016/j.trc.2024.104803
URL: https://www.sciencedirect.com/science/article/pii/S0968090X24003243
Impact factor: 7.6 (JCR Q1, Transportation Science & Technology)
Citations: 0
Abstract
Understanding travellers’ route preferences allows for the calculation of traffic flow on network segments and helps in assessing facility requirements, costs, and the impact of network modifications. Most research employs logit-based choice methods to model the route choices of individuals, but machine learning models are gaining increasing interest. However, all of these methods typically rely on a single ‘best’ model for predictions, which may be sensitive to measurement errors in the training data. Moreover, predictions from discarded models might still provide insights into route choices. The ensemble approach combines outcomes from multiple models using various pattern recognition methods, assumptions, and/or data sets to deliver improved predictions. When configured correctly, ensemble models offer greater prediction accuracy and account for uncertainties. To examine the advantages of ensemble techniques, a data set from the I-35W Bridge Collapse study in 2008 and another from the 2011 Travel Behavior Inventory (TBI), both in Minneapolis–St. Paul (the Twin Cities), are used to train a set of route choice models and combine them with ensemble techniques. The analysis considers travellers’ socio-demographics and trip attributes. The trained models are applied to two datasets, the Longitudinal Employer-Household Dynamics (LEHD) commute trips and TBI morning peak trips, for validation. Predictions are also compared with the loop detector records on freeway links. Traditional Multinomial Logit and Path-Size Logit models, along with machine learning methods such as Decision Tree, Random Forest, Extra Tree, AdaBoost, Support Vector Machine, and Neural Network, serve as the foundation for this study. Ensemble rules, including hard voting, soft voting, ranked choice voting, and stacking, are tested in both case studies. Based on the results, heterogeneous ensembles using soft voting outperform the base models and other ensemble rules on testing sets.
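To make the difference between two of the ensemble rules named in the abstract concrete, here is a minimal sketch of hard voting and soft voting over the route probabilities predicted by several base models for one traveller. It is not the authors' implementation: the function names `hard_vote` and `soft_vote`, the tie-breaking rule, and the probabilities in the example are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def hard_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Each model votes for its most likely route; the most-voted route wins.

    probabilities[m][r] is model m's predicted probability of route r.
    Ties are broken by the lowest route index (an arbitrary, assumed rule).
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    counts = Counter(votes)
    top = max(counts.values())
    return min(route for route, n in counts.items() if n == top)


def soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Average the predicted probabilities (optionally weighted per model)
    and return the route with the highest averaged probability."""
    n_models = len(probabilities)
    n_routes = len(probabilities[0])
    w = list(weights) if weights is not None else [1.0] * n_models
    total = sum(w)
    averaged = [
        sum(wm * p[r] for wm, p in zip(w, probabilities)) / total
        for r in range(n_routes)
    ]
    return max(range(n_routes), key=averaged.__getitem__)


# Hypothetical predictions from three base models over three candidate routes.
example = [
    [0.40, 0.35, 0.25],  # model A slightly prefers route 0
    [0.45, 0.40, 0.15],  # model B slightly prefers route 0
    [0.05, 0.90, 0.05],  # model C strongly prefers route 1
]

if __name__ == "__main__":
    print("hard vote:", hard_vote(example))
    print("soft vote:", soft_vote(example))
```

In this made-up example the two rules disagree: hard voting counts two weak votes for route 0, while soft voting lets the one confident model shift the averaged probability towards route 1. Soft voting keeps each model's confidence, which hard voting discards.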
About the journal:
Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.