Town trip forecasting based on data mining techniques

Q2 Engineering
Mohammad Fili, Majid Khedmati
Journal of Industrial Engineering International, volume 154 1, pages 1-13. Journal article, published 2020-12-01.
DOI: 10.30495/JIEI.2020.678774 (https://doi.org/10.30495/JIEI.2020.678774)

Abstract

This paper proposes a data mining approach for predicting the duration (travel time) of town trips in New York City. First, two novel approaches, one mathematical and one statistical, are proposed for grouping categorical variables that have a very large number of levels. Both approaches work from a cost matrix generated by repeated post-hoc tests on different pairs. Next, a random forest model is constructed to predict the trip type, short or long. Finally, for each trip type and for each of the mathematical and statistical approaches, a separate artificial neural network (ANN) is developed to predict trip duration. According to the results, the mathematical approach is more accurate than the statistical approach. The proposed methods are also compared with other methods from the literature, and the results show that they outperform all of them. The RMSE of the mathematical and statistical approaches is 4.23 and 4.27 minutes, respectively, for short trips, and the corresponding value is 9.5 minutes for long trips. In addition, a modified version of the nearest neighborhood approach, called modified nearest neighborhood (MNN), is proposed for predicting trip duration. This model yields accurate predictions, with an RMSE of 4.45 minutes.
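The abstract reports RMSE separately for short and long trips. As a minimal sketch of how such per-type figures can be computed, the Python snippet below groups predictions by trip type and takes the root mean squared error within each group. The function names (`rmse`, `rmse_by_trip_type`), the record layout, and the sample values are assumptions made for illustration; they are not taken from the paper and do not reproduce its data or results.

```python
import math
from collections import defaultdict


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_by_trip_type(records):
    """Compute RMSE per trip type.

    records: iterable of (trip_type, actual_minutes, predicted_minutes).
    This record layout is hypothetical, chosen only for this sketch.
    """
    grouped = defaultdict(lambda: ([], []))
    for trip_type, actual, predicted in records:
        grouped[trip_type][0].append(actual)
        grouped[trip_type][1].append(predicted)
    return {t: rmse(a, p) for t, (a, p) in grouped.items()}


# Made-up durations in minutes, not data from the paper.
records = [
    ("short", 10.0, 13.0),
    ("short", 8.0, 4.0),
    ("long", 40.0, 34.0),
    ("long", 55.0, 63.0),
]
scores = rmse_by_trip_type(records)
for trip_type, value in scores.items():
    print(f"{trip_type}: RMSE = {value:.2f} min")
```

Reporting the error per trip type, rather than as one pooled number, keeps the larger errors that are typical of long trips from masking how well the short-trip model performs.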
Source journal
Journal of Industrial Engineering International (Engineering: Industrial and Manufacturing Engineering)
CiteScore: 4.20
Self-citation rate: 0.00%
Articles published: 0
Review time: 12 weeks
About the journal: Journal of Industrial Engineering International is an international journal dedicated to the latest advances in industrial engineering. Its goal is to provide a platform for engineers and academics around the world to promote, share, and discuss new issues and developments in different areas of industrial engineering. All manuscripts must be prepared in English and are subject to a rigorous and fair peer-review process. Accepted articles appear online immediately. The journal publishes original research articles, review articles, technical notes, case studies, and letters to the Editor, in fields including but not limited to: Operations Research and Decision-Making Models, Production Planning and Inventory Control, Supply Chain Management, Quality Engineering, Applications of Fuzzy Theory in Industrial Engineering, Applications of Stochastic Models in Industrial Engineering, and Applications of Metaheuristic Methods in Industrial Engineering.