A supervised machine learning model for imputing missing boarding stops in smart card data.

Impact factor: 2.3 | JCR Q2 | Transportation Science & Technology
Public Transport | Pub date: 2023-01-01 | Epub date: 2022-12-07 | DOI: 10.1007/s12469-022-00309-0
Nadav Shalit, Michael Fire, Eran Ben-Elia
Citations: 0

Abstract

Public transport has become an essential part of urban existence with increased population densities and environmental awareness. Large quantities of data are currently generated, allowing for more robust methods to understand travel behavior by harvesting smart card usage. However, public transport datasets suffer from data integrity problems; boarding stop information may be missing due to imperfect acquisition processes or inadequate reporting. This study introduces a supervised machine learning method to impute missing boarding stops based on ordinal classification using GTFS timetable, smart card, and geospatial datasets. A new metric, Pareto Accuracy, is suggested to evaluate algorithms where classes have an ordinal nature. The results are based on a case study in the city of Beer Sheva, Israel, consisting of one month of smart card data. We show that our proposed method is robust to irregular travelers and significantly outperforms well-known imputation methods without the need to mine any additional datasets. Validation on data from another Israeli city using transfer learning shows that the presented model is general and context-free. The implications for transportation planning and travel behavior research are further discussed.
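
The abstract names Pareto Accuracy but does not define it, so the sketch below is not the authors' metric. It is a minimal illustration, under stated assumptions, of why ordinal classes call for a dedicated metric: when the classes are ordered stop positions along a route, a prediction that is one stop off is a smaller error than one that is ten stops off, and plain accuracy cannot tell them apart. The function name `tolerance_accuracy`, the `tolerance` parameter, and the sample stop indices are all hypothetical.

```python
from typing import Sequence


def tolerance_accuracy(
    actual: Sequence[int], predicted: Sequence[int], tolerance: int = 0
) -> float:
    """Share of predictions within `tolerance` ordinal steps of the truth.

    Assumption: classes are integer positions of stops along a route, so
    |actual - predicted| is a meaningful distance. With tolerance=0 this
    reduces to ordinary accuracy.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical boarding-stop positions along one route (not data from the study).
actual_stop_index = [3, 7, 0, 12, 5]
predicted_stop_index = [3, 8, 2, 12, 4]

exact = tolerance_accuracy(actual_stop_index, predicted_stop_index, tolerance=0)
within_one = tolerance_accuracy(actual_stop_index, predicted_stop_index, tolerance=1)
print(f"exact: {exact:.2f}, within one stop: {within_one:.2f}")
```

In this made-up sample, two of the five predictions match exactly and two more are one stop off, so relaxing the tolerance from 0 to 1 raises the score. Whether the paper's metric takes this form should be checked against the full text.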

Source journal
Public Transport (Transportation Science & Technology)
CiteScore: 5.40
Self-citation rate: 15.40%
Publication volume: 19
About the journal: The scope and purpose of the journal includes, but is not limited to, any type of research in the area of Public Transport: Planning and Operations. At its core, it serves the primary mission of advancing the state of the art and the state of the practice in computer-aided systems and scheduling in public transport. The journal considers any subject in this area, especially with a focus on planning and scheduling. The common ground is the use of computer-aided methods and operations research techniques to improve information management, network and route planning, vehicle and crew scheduling and rostering, vehicle monitoring and management, and practical experience with scheduling and public transport planning methods. Besides theoretical papers, the journal also publishes case studies and applications. Public Transport addresses transport operators, consulting firms and academic institutions involved in the development, utilization or research of computer-aided planning and scheduling in public transport. Officially cited as: Public Transp