LeGEND: A Top-Down Approach to Scenario Generation of Autonomous Driving Systems Assisted by Large Language Models

arXiv - CS - Software Engineering. Pub Date: 2024-09-16. DOI: arxiv-2409.10066

Shuncheng Tang, Zhenya Zhang, Jixiang Zhou, Lei Lei, Yuan Zhou, Yinxing Xue



Abstract

Autonomous driving systems (ADS) are safety-critical and require comprehensive testing before their deployment on public roads. While existing testing approaches primarily target the criticality of scenarios, they often overlook the diversity of the generated scenarios, which is also important for exposing system defects of different kinds. To bridge this gap, we propose LeGEND, which generates scenarios in a top-down fashion: it starts with abstract functional scenarios and then steps down to logical and concrete scenarios, so that scenario diversity can be controlled at the functional level. However, unlike logical scenarios, which can be formally described, functional scenarios are often documented in natural language (e.g., accident reports) and thus cannot be precisely parsed and processed by computers. To tackle this issue, LeGEND leverages recent advances in large language models (LLMs) to transform textual functional scenarios into formal logical scenarios. To mitigate the distraction caused by irrelevant information in functional scenario descriptions, we devise a two-phase transformation built around an intermediate language; consequently, LeGEND uses two LLMs, one for extracting information from functional scenarios and the other for converting the extracted information into formal logical scenarios. We experimentally evaluate LeGEND on Apollo, an industry-grade ADS from Baidu. The evaluation results show that LeGEND can effectively identify critical scenarios and that, compared to baseline approaches, it exhibits evident superiority in the diversity of generated scenarios. Moreover, we demonstrate the advantages of our two-phase transformation framework and the accuracy of the adopted LLMs.
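The top-down flow described in the abstract (functional scenario, then intermediate description, then logical scenario, then concrete scenario) can be illustrated with a minimal Python sketch. This is not LeGEND's implementation: every name here (`LogicalScenario`, `extract_information`, `to_logical_scenario`, `sample_concrete`), the `name:low:high` line format, the prompts, and the stub LLMs are hypothetical, and uniform random sampling merely stands in for whatever search the authors use to find critical concrete scenarios.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical sketch: none of these names or formats come from the paper.
LLM = Callable[[str], str]  # an LLM is modelled as a plain text-to-text function


@dataclass
class LogicalScenario:
    """A formal scenario in which each parameter has a (low, high) range."""
    name: str
    parameter_ranges: Dict[str, Tuple[float, float]]


def extract_information(report: str, llm: LLM) -> str:
    """Phase 1: the first LLM reduces a textual report to an intermediate description."""
    return llm("Extract the driving-relevant facts:\n" + report)


def to_logical_scenario(intermediate: str, llm: LLM) -> LogicalScenario:
    """Phase 2: the second LLM converts the intermediate description into
    'name:low:high' lines (an assumed format), which are parsed into ranges."""
    ranges: Dict[str, Tuple[float, float]] = {}
    for line in llm("Convert to parameter ranges:\n" + intermediate).splitlines():
        name, low, high = line.split(":")
        ranges[name.strip()] = (float(low), float(high))
    return LogicalScenario("from_report", ranges)


def sample_concrete(logical: LogicalScenario, rng: random.Random) -> Dict[str, float]:
    """Pick one concrete scenario by drawing each parameter from its range.
    Random sampling is a stand-in for a criticality-guided search."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in logical.parameter_ranges.items()}


# Stub LLMs so the sketch runs without any model or network access.
def stub_extractor(prompt: str) -> str:
    return "ego follows lead vehicle; lead vehicle brakes suddenly"


def stub_converter(prompt: str) -> str:
    return "ego_speed:5:20\nlead_gap:10:40"


if __name__ == "__main__":
    intermediate = extract_information("Accident report text ...", stub_extractor)
    logical = to_logical_scenario(intermediate, stub_converter)
    print(sample_concrete(logical, random.Random(0)))
```

Splitting the work into two calls mirrors the stated motivation for the two-phase design: the first step discards irrelevant detail from the report, so the second step only has to formalize a compact description.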
