Assessing the efficacy of pre-trained large language models in analyzing autonomous vehicle field test disengagements
Melika Ansarinejad, Sherif M. Gaweesh, Mohamed M. Ahmed
Accident Analysis & Prevention, Volume 220, Article 108178 (published 2025-07-25)
DOI: 10.1016/j.aap.2025.108178 · https://www.sciencedirect.com/science/article/pii/S0001457525002647
Citations: 0
Abstract
This study evaluates the efficacy of pre-trained large language models (LLMs) in analyzing disengagement reports from Level 2–3 autonomous vehicle (AV) field tests, using data provided by the California Department of Motor Vehicles. Disengagement reports document instances in which autonomous vehicles, tested under the Autonomous Vehicle Tester (AVT) and AVT Driverless Programs, transition from autonomous to manual control. These disengagements occur when human intervention is required because of incidents or limitations in the operational design domain that prevent AVs from functioning properly. Understanding the factors leading to disengagements is pivotal for assessing AV performance and for guiding infrastructure owners and operators (IOOs) on needed modifications. Manual approaches to analyzing disengagement data are labor-intensive and prone to human error. Our research investigates the capability of LLMs to automate this analysis, focusing on identifying patterns, categorizing disengagement causes, and extracting meaningful insights from extensive datasets. GPT-4o was employed as the LLM to analyze the disengagement reports. The study aims to measure the accuracy, efficiency, and reliability of these models in comparison with traditional techniques. The application of LLMs demonstrated significant potential in identifying insights from the disengagement dataset while effectively processing the textual data, achieving an accuracy of 87%. Several data limitations were encountered, including inconsistencies in disengagement descriptions across manufacturers, which posed challenges to standardizing the analysis. Additionally, the disengagement reports offered limited detail on the specific causes of disengagements and the surrounding conditions, restricting the depth of insights that could be drawn. Despite these challenges, our findings indicate that LLMs can substantially enhance the speed and precision of analyzing AV disengagement reports in a cost-effective manner, offering valuable insights that can inform further research and development in AV technology and safety protocols.
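The abstract does not disclose the authors' exact prompting pipeline, so the following is only a minimal illustrative sketch of the kind of workflow described: sending a free-text disengagement description to GPT-4o and asking it to assign one cause category. It assumes the OpenAI Python SDK, an API key in the environment, and a hypothetical category list; the real study's categories and prompts may differ.

```python
# Illustrative sketch (not the paper's published pipeline): classify one
# disengagement description into a cause category using GPT-4o.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical cause categories for illustration only.
CATEGORIES = [
    "perception limitation",
    "planning/control issue",
    "other road user behavior",
    "environmental/road conditions",
    "precautionary takeover by driver",
]

def classify_disengagement(description: str) -> str:
    """Map a free-text disengagement description to exactly one category."""
    prompt = (
        "Classify the following autonomous vehicle disengagement description "
        f"into exactly one of these categories: {', '.join(CATEGORIES)}.\n\n"
        f"Description: {description}\n\n"
        "Answer with the category name only."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for reproducible labeling
    )
    return response.choices[0].message.content.strip()

# Example usage with a made-up report entry:
# print(classify_disengagement(
#     "Safety driver took control due to heavy rain obscuring lane markings."))
```

In practice, a batch of such labels could be compared against manually coded reports to compute the kind of classification accuracy reported in the study.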
About the journal:
Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.