基于核心能力的自动驾驶感知基础模型研究

IF 4.8 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Open Journal of Vehicular Technology Pub Date : 2025-09-01 DOI:10.1109/OJVT.2025.3604823

Rajendramayavan Sathyam;Yueqi Li

{"title":"基于核心能力的自动驾驶感知基础模型研究","authors":"Rajendramayavan Sathyam;Yueqi Li","doi":"10.1109/OJVT.2025.3604823","DOIUrl":null,"url":null,"abstract":"Foundation models are revolutionizing autonomous driving perception, transitioning the field from narrow, task-specific deep learning models to versatile, general-purpose architectures trained on vast, diverse datasets. This survey examines how these models address critical challenges in autonomous perception, including limitations in generalization, scalability, and robustness to distributional shifts. The survey introduces a novel taxonomy structured around four essential capabilities for robust performance in dynamic driving environments: generalized knowledge, spatial understanding, multi-sensor robustness, and temporal reasoning. For each capability, the survey elucidates its significance and comprehensively reviews cutting-edge approaches. Diverging from traditional method-centric surveys, our unique framework prioritizes conceptual design principles, providing a capability-driven guide for model development and clearer insights into foundational aspects. We conclude by discussing key challenges, particularly those associated with the integration of these capabilities into real-time, scalable systems, and broader deployment challenges related to computational demands and ensuring model reliability against issues like hallucinations and out-of-distribution failures. The survey also outlines crucial future research directions to enable the safe and effective deployment of foundation models in autonomous driving systems.","PeriodicalId":34270,"journal":{"name":"IEEE Open Journal of Vehicular Technology","volume":"6 ","pages":"2554-2582"},"PeriodicalIF":4.8000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11146457","citationCount":"0","resultStr":"{\"title\":\"Foundation Models for Autonomous Driving Perception: A Survey Through Core Capabilities\",\"authors\":\"Rajendramayavan Sathyam;Yueqi Li\",\"doi\":\"10.1109/OJVT.2025.3604823\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Foundation models are revolutionizing autonomous driving perception, transitioning the field from narrow, task-specific deep learning models to versatile, general-purpose architectures trained on vast, diverse datasets. This survey examines how these models address critical challenges in autonomous perception, including limitations in generalization, scalability, and robustness to distributional shifts. The survey introduces a novel taxonomy structured around four essential capabilities for robust performance in dynamic driving environments: generalized knowledge, spatial understanding, multi-sensor robustness, and temporal reasoning. For each capability, the survey elucidates its significance and comprehensively reviews cutting-edge approaches. Diverging from traditional method-centric surveys, our unique framework prioritizes conceptual design principles, providing a capability-driven guide for model development and clearer insights into foundational aspects. We conclude by discussing key challenges, particularly those associated with the integration of these capabilities into real-time, scalable systems, and broader deployment challenges related to computational demands and ensuring model reliability against issues like hallucinations and out-of-distribution failures. The survey also outlines crucial future research directions to enable the safe and effective deployment of foundation models in autonomous driving systems.\",\"PeriodicalId\":34270,\"journal\":{\"name\":\"IEEE Open Journal of Vehicular Technology\",\"volume\":\"6 \",\"pages\":\"2554-2582\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2025-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11146457\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of Vehicular Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11146457/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of Vehicular Technology","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11146457/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}

引用次数: 0

摘要

基础模型正在彻底改变自动驾驶感知，将该领域从狭窄的、特定于任务的深度学习模型转变为在大量、不同的数据集上训练的通用、通用架构。本调查研究了这些模型如何解决自主感知中的关键挑战，包括泛化、可扩展性和对分布变化的鲁棒性的限制。该调查介绍了一种新的分类法，该分类法围绕动态驾驶环境中鲁棒性能的四个基本能力：广义知识、空间理解、多传感器鲁棒性和时间推理。对于每种能力，调查阐明了其重要性，并全面回顾了前沿方法。与传统的以方法为中心的调查不同，我们独特的框架优先考虑了概念设计原则，为模型开发提供了能力驱动的指导，并对基础方面提供了更清晰的见解。最后，我们讨论了关键的挑战，特别是那些与将这些功能集成到实时、可扩展的系统中相关的挑战，以及与计算需求相关的更广泛的部署挑战，并确保模型的可靠性，以应对幻觉和超出分布的故障等问题。该调查还概述了未来重要的研究方向，以便在自动驾驶系统中安全有效地部署基础模型。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Foundation Models for Autonomous Driving Perception: A Survey Through Core Capabilities

Foundation models are revolutionizing autonomous driving perception, transitioning the field from narrow, task-specific deep learning models to versatile, general-purpose architectures trained on vast, diverse datasets. This survey examines how these models address critical challenges in autonomous perception, including limitations in generalization, scalability, and robustness to distributional shifts. The survey introduces a novel taxonomy structured around four essential capabilities for robust performance in dynamic driving environments: generalized knowledge, spatial understanding, multi-sensor robustness, and temporal reasoning. For each capability, the survey elucidates its significance and comprehensively reviews cutting-edge approaches. Diverging from traditional method-centric surveys, our unique framework prioritizes conceptual design principles, providing a capability-driven guide for model development and clearer insights into foundational aspects. We conclude by discussing key challenges, particularly those associated with the integration of these capabilities into real-time, scalable systems, and broader deployment challenges related to computational demands and ensuring model reliability against issues like hallucinations and out-of-distribution failures. The survey also outlines crucial future research directions to enable the safe and effective deployment of foundation models in autonomous driving systems.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Open Journal of Vehicular Technology Multiple-

CiteScore

9.60

自引率

0.00%

发文量

审稿时长

10 weeks