Comparing econometric and machine learning models to fare evasion prediction
Benedetto Barabino, Roberto Ventura
Transportation Research Interdisciplinary Perspectives, vol. 34, Article 101636. Published 2025-09-18.
DOI: 10.1016/j.trip.2025.101636
https://www.sciencedirect.com/science/article/pii/S259019822500315X
Fare evasion poses a significant financial threat to Transit Agencies (TAs) and Public Transport Companies (PTCs) globally, especially within Proof-of-Payment Transit Systems (POP-TSs). Understanding and estimating fare evasion frequency is crucial for developing targeted countermeasures. Traditionally, Econometric Models (EMs) have been employed for this purpose, linking fare evasion frequency to specific predictors to assess their effects and significance. However, Machine Learning Models (MLMs) have recently emerged as promising tools, offering the potential for enhanced accuracy through complex data analysis. Despite the strengths of both approaches, the literature has lacked a comprehensive comparison between EMs and MLMs for predicting fare evasion frequency.
This study addresses this gap by developing, calibrating, and validating two alternative frequency estimation models: an EM based on a Generalised Linear Regression Model (GLRM) and an MLM based on an Artificial Neural Network Model (ANNM). Using 4,000 real-world records from an Italian mid-sized PTC, the models’ performances are quantitatively assessed through regression plots, error metrics, and fare evasion event ratios. The findings indicate that ANNM slightly outperforms GLRM on the considered dataset, showing a higher correlation coefficient, a reduced margin of error, and a fare evasion event ratio closer to one. The predictor effects are also explored, an area where ANNM’s “black box” nature traditionally limits transparency. An overview of these effects shows that while both models identify similar key factors, each prioritises different aspects of fare evasion influences. These insights can help TAs/PTCs select models based on data, interpretability needs, and fare evasion patterns, supporting more effective, data-driven management strategies.
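The comparison criteria named above (correlation coefficient, error metrics, and an event ratio) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not define the fare evasion event ratio, so the sketch assumes it is the total predicted count divided by the total observed count. The function names and the sample numbers are hypothetical and carry no relation to the paper's dataset or results.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean squared error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def event_ratio(observed, predicted):
    """ASSUMED definition: total predicted events over total observed events.

    A value close to one means the model reproduces the overall number of
    events, even if individual predictions are off.
    """
    return sum(predicted) / sum(observed)


def summarise(observed, predicted):
    """Collect all four criteria for one model in a dictionary."""
    return {
        "r": pearson_r(observed, predicted),
        "rmse": rmse(observed, predicted),
        "mae": mae(observed, predicted),
        "event_ratio": event_ratio(observed, predicted),
    }


# Made-up illustrative counts (NOT data from the paper).
observed = [2, 0, 5, 1, 3, 4]
model_a = [1.5, 0.5, 4.0, 1.5, 2.5, 3.5]  # hypothetical predictions, model A
model_b = [2.0, 0.2, 4.8, 1.1, 2.7, 4.1]  # hypothetical predictions, model B

if __name__ == "__main__":
    for name, predicted in (("model A", model_a), ("model B", model_b)):
        print(name, summarise(observed, predicted))
```

Under these criteria, a model would be preferred when it shows a higher correlation, lower errors, and an event ratio nearer to one, which is the pattern the abstract reports for ANNM relative to GLRM.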