Behavioral insights and hotspot identification: Integrating natural language processing, machine learning and geospatial analyses of cyclist crashes

IF 3.5 · Zone 2 (Engineering and Technology) · JCR Q1 (PSYCHOLOGY, APPLIED)
Vinayak Malaghan, Francesco Pilla, Pavlos Tafidis, Brian Rogers
Journal: Transportation Research Part F-Traffic Psychology and Behaviour, volume 113, pages 452-480
Publication date: 2025-05-20
DOI: 10.1016/j.trf.2025.05.005
URL: https://www.sciencedirect.com/science/article/pii/S136984782500169X
Citations: 0

Abstract

In response to the rising trend in the promotion and adoption of cycling, ensuring cyclist safety is paramount. Understanding the behavioural causes of crashes and identifying collision hotspots are important; however, these efforts are hindered by underreporting and limited data on all types of incidents, including near misses. To address these challenges, this study analyses text data reported on dedicated active travel collision platforms to categorize incidents and uncover behavioural patterns contributing to collisions. The reported text data are grouped into distinct themes by applying Term Frequency-Inverse Document Frequency (TF-IDF) vectorization and clustering. Additionally, the Getis-Ord Gi* statistic, an advanced geospatial technique, is computed to identify spatial clustering of collisions and to categorize geographical regions as hotspots and cold spots. Key themes contributing to collisions are grouped as follows: ‘close pass incidents,’ ‘blocked bicycle lanes,’ ‘cyclist incidents on tram tracks,’ ‘roundabout incidents,’ ‘left turn incidents,’ ‘incidents between buses and cyclists,’ ‘incidents involving cyclists and trucks,’ ‘incidents related to traffic lights and pedestrian crossings,’ and ‘turning incidents at intersections.’ Moreover, the hotspots for these incidents are located at or near the intersections of regional roads in the Central Business District (CBD) of Dublin, Ireland, and on the peripheral regional roads encircling the CBD. This study advances the state of the art by applying innovative machine learning techniques and advanced geospatial analyses to an alternative data source, the ‘crash descriptions’ from cyclist crashes. The insights from the unique themes and identified hotspots enhance understanding of risky behaviours and their spatial distribution, contributing to ongoing efforts to foster a safer cycling environment.
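
The two quantitative steps named in the abstract, TF-IDF weighting and the Getis-Ord Gi* statistic, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify their preprocessing, clustering algorithm, or spatial weights. The code below assumes whitespace tokenization, the classic tf × log(N/df) weighting, and a binary neighbour matrix that includes each cell itself. The sample reports, counts, and all function and variable names are hypothetical, and the clustering step is omitted because the algorithm is not named.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]  # assumption: whitespace tokens
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for every location.

    values  : incident count per location
    weights : n x n spatial weights; weights[i][i] must be non-zero for Gi*.
    Raises ZeroDivisionError if all values are equal or a row weights every
    location equally, since the statistic is undefined in those cases.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(x * x for x in values) / n - mean ** 2)
    scores = []
    for w_row in weights:
        w_sum = sum(w_row)
        w_sq_sum = sum(w * w for w in w_row)
        numerator = sum(w * x for w, x in zip(w_row, values)) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical crash descriptions (not taken from the study's data).
reports = [
    "bus close pass",
    "close pass on roundabout",
    "tram tracks wet",
]
vectors = tfidf_vectors(reports)

# Hypothetical incident counts for six road segments arranged in a line.
counts = [1, 1, 8, 9, 1, 1]
n = len(counts)
# Assumption: each segment neighbours itself and the adjacent segments.
weights = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]
scores = getis_ord_gi_star(counts, weights)

for segment, score in enumerate(scores):
    print(f"segment {segment}: Gi* = {score:+.3f}")
```

Terms that appear in few reports (such as "tram" here) receive higher TF-IDF weights than terms shared across reports (such as "close"), which is what lets a clustering step separate themes. For Gi*, positive scores mark segments surrounded by above-average counts (candidate hotspots) and negative scores mark candidate cold spots; in practice the scores are compared against normal-distribution critical values.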
Source journal
CiteScore: 7.60
Self-citation rate: 14.60%
Articles published: 239
Review time: 71 days
About the journal: Transportation Research Part F: Traffic Psychology and Behaviour focuses on the behavioural and psychological aspects of traffic and transport. The aim of the journal is to enhance theory development, improve the quality of empirical studies and to stimulate the application of research findings in practice. TRF provides a focus and a means of communication for the considerable amount of research activities that are now being carried out in this field. The journal provides a forum for transportation researchers, psychologists, ergonomists, engineers and policy-makers with an interest in traffic and transport psychology.