César Maia de Souza, Roberto Ponce-Lopez, Gonzalo Gaudencio Peraza-Mues, Alejandro Antonio Dominguez-Cristerna, Eric J. Miller
Imputing informal workers for transportation modeling in Latin America by the use of machine learning techniques
Latin American Transport Studies, Volume 3, Article 100032. Published 2025-03-28. DOI: 10.1016/j.latran.2025.100032
https://www.sciencedirect.com/science/article/pii/S2950024925000095
Abstract
The informal economy plays a critical role in the Global South, particularly in large urban areas such as Mexico City, where over 50% of jobs belong to this segment. Understanding the travel behavior of informal workers and integrating these patterns into advanced transportation models, such as activity-based models (ABMs), is therefore crucial. This study proposes a unique approach for identifying informal workers across distinct economic sectors in Mexico’s Monterrey metropolitan area, using an Origin-Destination (OD) survey and the National Occupation and Employment Survey (ENOE). Machine learning models, trained on the ENOE dataset and applied to the OD survey, first classified workers as formal or informal and then reassigned informal workers who had initially been classified under “Other” to the construction or commerce categories. The Gradient Boosted Trees (GBT) classifier performed best, yielding accuracies of 78.0% and 70.7% for the two stages. Differences between predicted and observed values fall within an acceptable range, especially in sectors with high rates of informal work. The resulting estimate and characterization of informal workers can potentially be integrated into ABMs, providing a foundation for assessing how informal workers respond to infrastructure policy interventions.
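The two-stage procedure described in the abstract can be sketched as a simple cascade. The following is a minimal illustration, not the authors' implementation: the field names (`education_years`, `age`), the toy decision rules, and the function `impute_informal_workers` are all hypothetical stand-ins. In the study, each stage would be a GBT classifier trained on ENOE and applied to the OD survey records.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Classifier = Callable[[Record], str]


def impute_informal_workers(
    records: List[Record], stage1: Classifier, stage2: Classifier
) -> List[Record]:
    """Apply a two-stage cascade to survey records without mutating them.

    Stage 1 labels each worker as "formal" or "informal".
    Stage 2 runs only on informal workers whose sector is "Other" and
    reassigns them to "construction" or "commerce".
    """
    results = []
    for rec in records:
        out = dict(rec)  # copy so the input records stay unchanged
        out["formality"] = stage1(rec)
        if out["formality"] == "informal" and rec["sector"] == "Other":
            out["sector"] = stage2(rec)
        results.append(out)
    return results


# Hypothetical stand-in rules; a trained classifier's predict step
# would take their place in a real pipeline.
def toy_stage1(rec: Record) -> str:
    return "informal" if rec["education_years"] < 9 else "formal"


def toy_stage2(rec: Record) -> str:
    return "construction" if rec["age"] < 40 else "commerce"


# Invented sample records, for illustration only.
od_sample = [
    {"id": 1, "sector": "Other", "education_years": 6, "age": 30},
    {"id": 2, "sector": "Other", "education_years": 12, "age": 45},
    {"id": 3, "sector": "Manufacturing", "education_years": 6, "age": 50},
    {"id": 4, "sector": "Other", "education_years": 8, "age": 55},
]

imputed = impute_informal_workers(od_sample, toy_stage1, toy_stage2)
for row in imputed:
    print(row["id"], row["formality"], row["sector"])
```

Passing the two stages as callables keeps the cascade logic separate from the choice of model, so any classifier that maps a record to a label could be substituted for the toy rules.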