Map4comm: A map-aware collaborative perception framework with efficient-bandwidth information fusion
Huan Qiu, Jian Zhou, Bijun Li, Qin Zou, Youchen Tang, Man Luo
Information Fusion, Volume 126, Article 103567. Published 2025-07-29.
DOI: 10.1016/j.inffus.2025.103567
URL: https://www.sciencedirect.com/science/article/pii/S1566253525006396
Journal impact factor: 15.5 (JCR Q1, Computer Science, Artificial Intelligence)
Abstract
V2I (Vehicle-to-Infrastructure) collaborative perception improves the perception of dynamic driving environments by sharing multi-viewpoint information about the same scene over a communication link, and it is gradually becoming an essential part of intelligent transportation systems. However, it introduces an inherent trade-off between communication bandwidth and perception performance. To address this bottleneck, we introduce a map-mask that is precisely aligned with the perceptual spatial features. The mask filters out the background of the real-time perceptual features so that only the perceptually critical areas are extracted as communication content. Based on this map-mask, we propose Map4comm, a unified map-aware collaborative perception framework that balances communication bandwidth against perception performance. To save bandwidth, Map4comm introduces a Local Communication Area Selection (LCAS) mechanism that uses the map-mask to choose which areas the system communicates. To improve performance, Map4comm presents an Adaptive Covoxel Feature Alignment (ACFA) strategy that coarsely aligns the heterogeneous low-dimensional voxel features of the vehicle, the infrastructure, and the map, which in turn improves overall perception performance. To evaluate Map4comm, we conducted mapping and testing on the large-scale vehicle–infrastructure collaborative sequential perception dataset V2X-Seq-SPD. The experimental results show that Map4comm outperforms all other collaborative perception methods in perception performance while incurring the lowest communication cost among state-of-the-art collaborative perception methods.
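The abstract does not spell out how the map-mask is applied, so the following is only a minimal sketch of the general idea of mask-based feature selection, not the authors' LCAS implementation. It assumes a small dense grid of per-cell feature vectors and a binary mask of the same shape in which 1 marks a map-relevant cell. The names `select_cells`, `scatter_cells`, and `transmitted_fraction` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Feature = List[float]
Grid = List[List[Feature]]   # H x W cells, each holding a C-dimensional feature
Mask = List[List[int]]       # H x W, 1 = map-relevant cell, 0 = background (assumed encoding)
Sparse = List[Tuple[int, int, Feature]]


def select_cells(features: Grid, map_mask: Mask) -> Sparse:
    """Sender side: keep only masked cells, as sparse (row, col, feature) entries."""
    selected: Sparse = []
    for r, row in enumerate(features):
        for c, feat in enumerate(row):
            if map_mask[r][c]:
                selected.append((r, c, feat))
    return selected


def scatter_cells(sparse: Sparse, height: int, width: int, channels: int) -> Grid:
    """Receiver side: rebuild a dense grid, zero-filling cells that were not sent."""
    grid = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for r, c, feat in sparse:
        grid[r][c] = list(feat)
    return grid


def transmitted_fraction(sparse: Sparse, height: int, width: int) -> float:
    """Share of grid cells that are actually transmitted."""
    return len(sparse) / (height * width)


# Toy 3 x 4 grid with 2-channel features; each feature stores its own (row, col).
H, W, C = 3, 4, 2
features = [[[float(r), float(c)] for c in range(W)] for r in range(H)]
map_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

message = select_cells(features, map_mask)
rebuilt = scatter_cells(message, H, W, C)
fraction = transmitted_fraction(message, H, W)
```

In this toy case only 4 of the 12 cells are sent, and the receiver zero-fills the rest. A real system would operate on learned feature tensors and would also account for the cost of transmitting cell indices.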
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.