Examining traffic violations in severe casualty truck crashes: A text mining and reliable network analysis of narrative reports.

IF 1.9 | Zone 3, Engineering & Technology | Q3 | PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH
Yunfei Zhao, Kai Kang, Wenjian Jia, Zhe Guo, Jie Zhang, Tong Zhu
Traffic Injury Prevention, pp. 1-10. Journal Article. Published 2025-09-22.
DOI: 10.1080/15389588.2025.2553194 (https://doi.org/10.1080/15389588.2025.2553194)
Citations: 0

Examining traffic violations in severe casualty truck crashes: A text mining and reliable network analysis of narrative reports.

Objective: Trucks are more likely to be involved in severe casualty crashes compared with other vehicle types. The elimination of traffic violations is crucial to preventing severe casualty truck crashes. However, there is a lack of comprehensive analyses of truck violations and their conditions related to severe casualty crashes. This study aims to identify thematic communities of truck driver violations through a modeling framework integrating text mining and reliable network analysis.

Methods: This study collected 432 textual reports of severe casualty truck crashes in China from 2013 to 2020. Each report was divided into a crash narrative and metadata, which were preprocessed separately. For the narratives, the ELECTRA model was used for Chinese word segmentation and part-of-speech tagging, and keywords were extracted by combining its output with TF-IDF. The metadata were processed through named entity recognition, geocoding, etc., and then merged with the narrative keywords. Association rules were mined with the Apriori algorithm to construct a network with keywords as nodes and lift values as edge weights, which was visualized with the ForceAtlas2 algorithm. The Leiden algorithm was adopted to detect thematic communities, whose significance was validated by QStest.
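The network construction step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it computes lift only for keyword pairs rather than running the full Apriori search, and the function name `lift_edges`, the `min_support` threshold, and the toy reports are all assumptions made for illustration. For two keywords a and b, lift is P(a, b) / (P(a) P(b)), so values above 1 indicate that the keywords co-occur more often than independence would predict.

```python
from collections import Counter
from itertools import combinations


def lift_edges(transactions, min_support=0.2):
    """Return {(a, b): lift} for keyword pairs with joint support >= min_support.

    Simplified stand-in for association rule mining: only 2-itemsets are
    considered, and the pair (a, b) is stored in alphabetical order.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for keywords in transactions:
        unique = sorted(set(keywords))  # ignore repeated keywords in one report
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    edges = {}
    for (a, b), count in pair_counts.items():
        support_ab = count / n
        if support_ab < min_support:
            continue  # prune rare pairs, as a support threshold would
        # lift(a, b) = P(a, b) / (P(a) * P(b))
        edges[(a, b)] = support_ab / ((item_counts[a] / n) * (item_counts[b] / n))
    return edges


# Hypothetical toy data: one keyword list per crash report.
reports = [
    ["overloading", "rural_highway", "curve"],
    ["overloading", "rural_highway"],
    ["speeding", "night", "straight_road"],
    ["speeding", "night"],
]
edges = lift_edges(reports, min_support=0.5)
for (a, b), lift in sorted(edges.items()):
    print(f"{a} -- {b}: lift = {lift:.2f}")
```

The resulting dictionary is a weighted edge list, so it could be passed to a graph library for layout and community detection; that step is omitted here because the abstract does not say which implementation was used.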

Results: Text mining yields the 77 most relevant keywords from the 432 police narratives. Overloading and speeding emerge as the predominant traffic violations, associated with 43% and 30% of severe casualty truck crashes, respectively. Four overloading and five speeding thematic communities are identified as statistically significant. Notably, the circumstances associated with truck overloading and speeding differ. For overloading, conditions contributing to severe casualty crashes encompass rural highways with curves or slopes, provincial or national highways in the afternoon, expressways during nighttime, and locations close to signalized intersections. In contrast, five circumstances are linked to speeding: curved or sloped road segments during the afternoon, rural highways in autumn, straight road sections during the night, work zone areas on four-lane roadways, and unsignalized intersections on weekdays. We also extracted vehicle and driver features across diverse environments, facilitating the identification of key elements for preventing severe casualty truck crashes. For instance, light trucks exhibit a higher susceptibility to severe casualty crashes attributed to overloading on rural highways.

Conclusions: This study demonstrates the advantages of textual data and reliable network analysis. Text data analysis proves to be more convenient, yielding richer and more comprehensive information while demanding less subjective judgment. The findings of this paper inform subsequent enforcement and engineering measures for mitigating severe casualty truck crashes.

Source journal
Traffic Injury Prevention (PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH)
CiteScore: 3.60
Self-citation rate: 10.00%
Articles published: 137
Review time: 3 months
About the journal: The purpose of Traffic Injury Prevention is to bridge the disciplines of medicine, engineering, public health and traffic safety in order to foster the science of traffic injury prevention. The archival journal focuses on research, interventions and evaluations within the areas of traffic safety, crash causation, injury prevention and treatment. General topics within the journal's scope are driver behavior, road infrastructure, emerging crash avoidance technologies, crash and injury epidemiology, alcohol and drugs, impact injury biomechanics, vehicle crashworthiness, occupant restraints, pedestrian safety, evaluation of interventions, economic consequences, and emergency and clinical care with specific application to traffic injury prevention. The journal includes full-length papers, review articles, case studies, brief technical notes and commentaries.