Multimodal Large-Language Model Empowering Next-Generation Autonomous Driving Systems

Impact Factor: 7.8

Journal of Intelligent and Connected Vehicles | Pub Date: 2025-06-01 | DOI: 10.26599/JICV.2025.9210059

Zhiqiang Hu; Mingxing Xu; Qixiu Cheng

Citations: 0

Abstract

Autonomous driving technology has made significant advancements in recent years. The evolution of autonomous driving systems from traditional modular designs to end-to-end learning paradigms has led to comprehensive improvements in driving capabilities. In modular designs, driving tasks are segmented into independent modules, such as perception, decision-making, planning, and control. This modular structure offers high explainability and safety in simple scenarios but is hindered by limited generalizability in complex traffic environments, and the sequential connection of multiple modules often leads to error accumulation. In contrast, end-to-end methods process perception data directly to produce control outputs, thereby mitigating information loss and sequential error accumulation, ultimately improving scene generalization in diverse environments. However, this approach is limited by strong data dependency, low interpretability, and inadequate handling of long-tail scenarios (Zhao et al., 2024).

