LeGEND: A Top-Down Approach to Scenario Generation of Autonomous Driving Systems Assisted by Large Language Models
Shuncheng Tang, Zhenya Zhang, Jixiang Zhou, Lei Lei, Yuan Zhou, Yinxing Xue
arXiv - CS - Software Engineering, 2024-09-16
arXiv:2409.10066
Abstract
Autonomous driving systems (ADS) are safety-critical and require
comprehensive testing before their deployment on public roads. While existing
testing approaches primarily target the criticality of scenarios, they often
overlook the diversity of the generated scenarios, which is also important for
revealing system defects in different aspects. To bridge the gap, we propose
LeGEND, which generates scenarios in a top-down fashion: it starts with
abstract functional scenarios and then steps down to logical and concrete
scenarios, so that scenario diversity can be controlled at the functional
level. However, unlike logical scenarios, which can be formally described,
functional scenarios are often documented in natural language (e.g., accident
reports) and thus cannot be precisely parsed and processed by computers. To
tackle this issue, LeGEND leverages recent advances in large language
models (LLMs) to transform textual functional scenarios into formal logical
scenarios. To mitigate the distraction caused by irrelevant information in
functional scenario descriptions, we devise a two-phase transformation built
around an intermediate language; consequently, LeGEND uses two LLMs, one
for extracting information from functional scenarios and the other for converting
the extracted information into formal logical scenarios. We experimentally
evaluate LeGEND on Apollo, an industry-grade ADS from Baidu. Evaluation results
show that LeGEND effectively identifies critical scenarios and that, compared to
baseline approaches, it is clearly superior in the diversity of the
generated scenarios. We also demonstrate the advantages of our
two-phase transformation framework and the accuracy of the adopted LLMs.
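
The top-down flow described in the abstract (functional scenario, intermediate language, logical scenario, concrete scenario) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the one-fact-per-line intermediate format, the "name: low, high" parameter-range format, and the stubbed LLM callables are all hypothetical. The abstract also does not say how concrete scenarios are selected, so uniform random sampling is used here purely as a placeholder.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM call: prompt text in, response text out.
LLM = Callable[[str], str]


@dataclass
class LogicalScenario:
    """Parameter ranges of a scenario (a simplified, assumed format)."""
    ranges: Dict[str, Tuple[float, float]]


def extract(functional_text: str, llm: LLM) -> List[str]:
    """Phase 1: reduce a free-text description to intermediate-language lines."""
    response = llm("Extract the driving-relevant facts, one per line:\n" + functional_text)
    return [line.strip() for line in response.splitlines() if line.strip()]


def convert(intermediate: List[str], llm: LLM) -> LogicalScenario:
    """Phase 2: turn intermediate lines into ranges written as 'name: low, high'."""
    response = llm("Convert to parameter ranges:\n" + "\n".join(intermediate))
    ranges: Dict[str, Tuple[float, float]] = {}
    for line in response.splitlines():
        if ":" not in line:
            continue  # skip anything that is not a range line
        name, bounds = line.split(":", 1)
        low, high = (float(value) for value in bounds.split(","))
        ranges[name.strip()] = (low, high)
    return LogicalScenario(ranges)


def concretize(logical: LogicalScenario, rng: random.Random) -> Dict[str, float]:
    """Pick one concrete scenario by sampling each range (placeholder strategy)."""
    return {name: rng.uniform(low, high) for name, (low, high) in logical.ranges.items()}


# Canned responses stand in for the two LLMs so the sketch runs offline.
def fake_extractor(prompt: str) -> str:
    return "ego vehicle goes straight\nlead vehicle brakes suddenly\n"


def fake_converter(prompt: str) -> str:
    return "ego_speed: 8, 12\nlead_distance: 15, 30"


report = "On a rainy evening, the car ahead braked suddenly and was hit from behind."
facts = extract(report, fake_extractor)
logical = convert(facts, fake_converter)
concrete = concretize(logical, random.Random(0))
print(facts)
print(logical)
print(concrete)
```

Keeping the two phases as separate functions mirrors the abstract's motivation: the second model only sees the extracted facts, not incidental details of the original report such as the weather or time of day.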