A Parallel FP-Growth Mining Algorithm with Load Balancing Constraints for Traffic Crash Data

Yang Yang, Na Tian, Yunpeng Wang, Zhen-zhou Yuan
Int. J. Comput. Commun. Control, published 2022-07-20. DOI: https://doi.org/10.15837/ijccc.2022.4.4806
Citations: 9

Abstract

Traffic safety is an important part of sustainable roadway development. Freeway traffic crashes typically cause serious casualties and property losses and pose a serious threat to public safety. Identifying the potential correlations among risk factors and revealing their coupling mechanisms is an effective way to explore and identify the causes of freeway crashes. However, existing association rule mining algorithms still have limitations in both efficiency and accuracy. To address this, using freeway traffic crash data obtained from WDOT (Washington Department of Transportation), this research constructed a multi-dimensional, multilevel system for traffic crash analysis. With load balancing taken into account, the FP-Growth (Frequent Pattern Growth) algorithm was parallelized on the Hadoop platform to achieve efficient and accurate association rule mining on massive amounts of traffic crash data; then, based on the coupling mechanisms found among the crash precursors, the causes of freeway traffic crashes were identified. The results show that the parallel FP-Growth algorithm with load balancing constraints runs faster on big data than both the conventional FP-Growth algorithm and the parallel FP-Growth algorithm. The improved algorithm makes full use of Hadoop cluster resources and is better suited to mining large traffic crash data sets, while retaining the original advantages of conventional association rule mining algorithms. In addition, the association rule mining model with improved multi-dimensional interaction proposed in this research can accurately and efficiently capture the occurrence mechanism of freeway traffic crashes with serious consequences, which probably have lower support.
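
The abstract does not spell out how the load balancing constraint is enforced, so the following is only a minimal single-machine sketch of one common way to balance a parallel FP-Growth job, not the authors' method. It assumes each frequent item's mining cost can be approximated by its rank in the frequency-ordered item list, and it assigns items to worker groups greedily, heaviest first. The function names, the load estimate, and the sample crash attributes are all hypothetical.

```python
from collections import Counter
import heapq


def frequent_items(transactions, min_support):
    """Return (item, count) pairs meeting min_support, most frequent first."""
    counts = Counter(item for t in transactions for item in set(t))
    items = [(item, c) for item, c in counts.items() if c >= min_support]
    # Descending count, then item name, so the order is deterministic.
    items.sort(key=lambda pair: (-pair[1], pair[0]))
    return items


def balanced_groups(f_list, num_groups):
    """Assign frequent items to groups so estimated loads stay even.

    Assumption: the load of an item is its 1-based rank in f_list, since an
    item at rank r can have up to r - 1 more-frequent items in its
    conditional pattern base. This estimate is illustrative only.
    """
    loads = [(rank + 1, item) for rank, (item, _) in enumerate(f_list)]
    loads.sort(key=lambda pair: (-pair[0], pair[1]))  # heaviest first

    # Min-heap of (current total load, group id): always fill the lightest.
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    groups = {g: [] for g in range(num_groups)}
    for load, item in loads:
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


if __name__ == "__main__":
    # Hypothetical crash records, each a set of attribute values.
    transactions = [
        ["wet", "night", "speeding"],
        ["wet", "night"],
        ["wet", "speeding"],
        ["dry", "night"],
        ["wet", "night", "truck"],
    ]
    f_list = frequent_items(transactions, min_support=2)
    print(balanced_groups(f_list, num_groups=2))
```

In a Hadoop setting, each group would correspond to one reducer that builds and mines the conditional FP-trees for its items; the greedy assignment only aims to keep those reducers' estimated work roughly equal.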