IEEE Transactions on Pattern Analysis and Machine Intelligence: Latest Articles

End-to-End Autonomous Driving without Costly Modularization and 3D Manual Annotation
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-23 · DOI: 10.1109/tpami.2025.3610517
Mingzhe Guo, Zhipeng Zhang, Yuan He, Ke Wang, Liping Jing, Haibin Ling
{"title":"End-to-End Autonomous Driving without Costly Modularization and 3D Manual Annotation","authors":"Mingzhe Guo, Zhipeng Zhang, Yuan He, Ke Wang, Liping Jing, Haibin Ling","doi":"10.1109/tpami.2025.3610517","DOIUrl":"https://doi.org/10.1109/tpami.2025.3610517","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"2 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145127455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Semantic Concentration for Self-Supervised Dense Representations Learning
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-23 · DOI: 10.1109/tpami.2025.3609758
Peisong Wen, Qianqian Xu, Siran Dai, Runmin Cong, Qingming Huang
{"title":"Semantic Concentration for Self-Supervised Dense Representations Learning","authors":"Peisong Wen, Qianqian Xu, Siran Dai, Runmin Cong, Qingming Huang","doi":"10.1109/tpami.2025.3609758","DOIUrl":"https://doi.org/10.1109/tpami.2025.3609758","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"321 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145127466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Data-And Knowledge-Driven Visual Abductive Reasoning
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-23 · DOI: 10.1109/tpami.2025.3613712
Chen Liang, Wenguan Wang, Ling Chen, Yi Yang
{"title":"Data-And Knowledge-Driven Visual Abductive Reasoning","authors":"Chen Liang, Wenguan Wang, Ling Chen, Yi Yang","doi":"10.1109/tpami.2025.3613712","DOIUrl":"https://doi.org/10.1109/tpami.2025.3613712","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"86 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145127459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Translating Images to Road Network: A Sequence-to-Sequence Perspective
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-23 · DOI: 10.1109/tpami.2025.3612940
Jiachen Lu, Ming Nie, Bozhou Zhang, Renyuan Peng, Xinyue Cai, Hang Xu, Feng Wen, Wei Zhang, Li Zhang
{"title":"Translating Images to Road Network: A Sequence-to-Sequence Perspective.","authors":"Jiachen Lu,Ming Nie,Bozhou Zhang,Renyuan Peng,Xinyue Cai,Hang Xu,Feng Wen,Wei Zhang,Li Zhang","doi":"10.1109/tpami.2025.3612940","DOIUrl":"https://doi.org/10.1109/tpami.2025.3612940","url":null,"abstract":"The extraction of road network is essential for the generation of high-definition maps since it enables the precise localization of road landmarks and their interconnections. However, generating road network poses a significant challenge due to the conflicting underlying combination of Euclidean (e.g., road landmarks location) and non-Euclidean (e.g., road topological connectivity) structures. Existing methods struggle to merge the two types of data domains effectively, but few of them address it properly. Instead, our work establishes a unified representation of both types of data domain by projecting both Euclidean and non- Euclidean data into an integer series called RoadNet Sequence. Further than modeling an auto-regressive sequence-to-sequence Transformer model to understand RoadNet Sequence, we decouple the dependency of RoadNet Sequence into a mixture of autoregressive and non-autoregressive dependency. Building on this, our proposed non-autoregressive sequence-to-sequence approach leverages non-autoregressive dependencies while fixing the gap towards auto-regressive dependencies, resulting in success in both efficiency and accuracy. We further identify two main bottlenecks in the current RoadNetTransformer on a non-overfitting split of the dataset: poor landmark detection limited by the BEV Encoder and error propagation to topology reasoning. Therefore, we propose Topology-Inherited Training to inherit better topology knowledge into RoadNetTransformer. Additionally, we collect SD-Maps from open-source map datasets and use this prior information to significantly improve landmark detection and reachability. Extensive experiments on the nuScenes dataset demonstrate the superiority of RoadNet Sequence representation and the non-autoregressive approach compared to existing stateof- the-art alternatives. Our code is publicly available at opensource https://github.com/fudan-zvg/RoadNetworkTRansformer.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"22 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145127196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
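The paper's central idea is to flatten a road graph, whose landmark positions are Euclidean and whose connectivity is non-Euclidean, into a single integer series that a sequence-to-sequence Transformer can consume. The Python sketch below illustrates one plausible way to build such a sequence under assumed choices (coordinate quantization bins, a parent-index encoding of topology, and an end-of-sequence token); it is not the paper's actual RoadNet Sequence tokenization.

```python
# Toy serialization of a road graph into one integer sequence, in the spirit
# of the RoadNet Sequence idea. The vocabulary layout, bin count, parent-index
# topology encoding, and EOS token are hypothetical choices for illustration.
from dataclasses import dataclass

NUM_BINS = 200                    # quantization bins per BEV axis (assumed)
EOS_TOKEN = 10_000                # hypothetical end-of-sequence id


@dataclass
class RoadNode:
    x: float                      # normalized BEV coordinate in [0, 1)
    y: float                      # normalized BEV coordinate in [0, 1)
    parent: int                   # index of the predecessor landmark, -1 for a root


def quantize(v: float, bins: int = NUM_BINS) -> int:
    """Map a normalized coordinate to an integer bin (Euclidean part)."""
    return min(int(v * bins), bins - 1)


def road_graph_to_sequence(nodes: list[RoadNode]) -> list[int]:
    """Emit [x_bin, y_bin, parent+1] per landmark, so both landmark locations
    (Euclidean) and connectivity (non-Euclidean) live in one integer series."""
    seq: list[int] = []
    for node in nodes:
        seq.extend([quantize(node.x), quantize(node.y), node.parent + 1])
    seq.append(EOS_TOKEN)
    return seq


if __name__ == "__main__":
    graph = [
        RoadNode(0.10, 0.20, parent=-1),   # root landmark
        RoadNode(0.45, 0.22, parent=0),    # connected to landmark 0
        RoadNode(0.80, 0.60, parent=1),    # connected to landmark 1
    ]
    print(road_graph_to_sequence(graph))   # [20, 40, 0, 90, 44, 1, 160, 120, 2, 10000]
```

A sequence-to-sequence Transformer can then be trained to emit such a series from BEV image features, either fully auto-regressively or, as the paper argues, with a partly non-autoregressive factorization for speed.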
IPF-RDA: An Information-Preserving Framework for Robust Data Augmentation
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-22 · DOI: 10.1109/tpami.2025.3613005
Suorong Yang, Hongchao Yang, Suhan Guo, Furao Shen, Jian Zhao
{"title":"IPF-RDA: An Information-Preserving Framework for Robust Data Augmentation","authors":"Suorong Yang, Hongchao Yang, Suhan Guo, Furao Shen, Jian Zhao","doi":"10.1109/tpami.2025.3613005","DOIUrl":"https://doi.org/10.1109/tpami.2025.3613005","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"99 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145116265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Lagrangian Motion Fields for Long-term Motion Generation
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-22 · DOI: 10.1109/tpami.2025.3612380
Yifei Yang, Zikai Huang, Chenshu Xu, Shengfeng He
{"title":"Lagrangian Motion Fields for Long-term Motion Generation","authors":"Yifei Yang, Zikai Huang, Chenshu Xu, Shengfeng He","doi":"10.1109/tpami.2025.3612380","DOIUrl":"https://doi.org/10.1109/tpami.2025.3612380","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"2 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145116231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MGAF: LiDAR-Camera 3D Object Detection with Multiple Guidance and Adaptive Fusion
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-22 · DOI: 10.1109/tpami.2025.3612958
Baojie Fan, Xiaotian Li, Yuhan Zhou, Caixia Xia, Huijie Fan, Fengyu Xu, Jiandong Tian
{"title":"MGAF: LiDAR-Camera 3D Object Detection with Multiple Guidance and Adaptive Fusion.","authors":"Baojie Fan,Xiaotian Li,Yuhan Zhou,Caixia Xia,Huijie Fan,Fengyu Xu,Jiandong Tian","doi":"10.1109/tpami.2025.3612958","DOIUrl":"https://doi.org/10.1109/tpami.2025.3612958","url":null,"abstract":"Recent years have witnessed the remarkable progress of 3D multi-modality object detection methods based on the Bird's-Eye-View (BEV) perspective. However, most of them overlook the complementary interaction and guidance between LiDAR and camera. In this work, we propose a novel multi-modality 3D objection detection method, with multi-guided global interaction and LiDAR-guided adaptive fusion, named MGAF. Specifically, we introduce sparse depth guidance (SDG) and LiDAR occupancy guidance (LOG) to generate 3D features with sufficient depth and spatial information. The designed semantic segmentation network captures category and orientation prior information for raw point clouds. In the following, an Adaptive Fusion Dual Transformer (AFDT) is developed to adaptively enhance the interaction of different modal BEV features from both global and bidirectional perspectives. Meanwhile, additional downsampling with sparse height compression and multi-scale dual-path transformer (MSDPT) are designed in order to enlarge the receptive fields of different modal features. Finally, a temporal fusion module is introduced to aggregate features from previous frames. Notably, the proposed AFDT is general, which also shows superior performance on other models. Our framework has undergone extensive experimentation on the large-scale nuScenes dataset, Waymo Open Dataset, and long-range Argoverse2 dataset, consistently demonstrating state-of-the-art performance. The code will be released at:https://github.com/xioatian1/MGAF. 3D object detection, multi-modality, multiple guidance, adaptive fusion, BEV representation, autonomous driving.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"51 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145117074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
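The mechanism this abstract centers on is LiDAR-guided adaptive fusion of per-modality BEV features. The PyTorch sketch below shows a deliberately simplified gated variant of that idea: a gate predicted from both modalities weights LiDAR against camera features at every BEV cell. The paper's AFDT is a dual-transformer module with global and bidirectional interaction, so the convolutional gate, channel count, and layer sizes here are illustrative assumptions only.

```python
# Simplified LiDAR-guided gated fusion of two BEV feature maps (PyTorch).
# This is a sketch of the general idea, not the paper's AFDT module.
import torch
import torch.nn as nn


class GatedBEVFusion(nn.Module):
    def __init__(self, channels: int = 128):
        super().__init__()
        # Per-cell gate conditioned on both modalities.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.out = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, lidar_bev: torch.Tensor, cam_bev: torch.Tensor) -> torch.Tensor:
        # lidar_bev, cam_bev: (B, C, H, W) BEV features from the two branches.
        g = self.gate(torch.cat([lidar_bev, cam_bev], dim=1))
        fused = g * lidar_bev + (1.0 - g) * cam_bev   # per-cell modality weighting
        return self.out(fused)


if __name__ == "__main__":
    fusion = GatedBEVFusion(channels=128)
    lidar = torch.randn(2, 128, 180, 180)
    cam = torch.randn(2, 128, 180, 180)
    print(fusion(lidar, cam).shape)   # torch.Size([2, 128, 180, 180])
```

Replacing the convolutional gate with cross-attention between the two BEV feature maps would move this toy closer to a transformer-style fusion block, at the cost of memory that grows with the number of BEV cells.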
Towards Optimal Mixture of Experts System for 3D Object Detection: A Game of Accuracy, Efficiency and Adaptivity
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-22 · DOI: 10.1109/tpami.2025.3611795
Linshen Liu, Pu Wang, Guanlin Wu, Junyue Jiang, Hao Yang
{"title":"Towards Optimal Mixture of Experts System for 3D Object Detection: A Game of Accuracy, Efficiency and Adaptivity.","authors":"Linshen Liu,Pu Wang,Guanlin Wu,Junyue Jiang,Hao Yang","doi":"10.1109/tpami.2025.3611795","DOIUrl":"https://doi.org/10.1109/tpami.2025.3611795","url":null,"abstract":"Autonomous vehicles, open-world robots, and other automated systems rely on accurate, efficient perception modules for real-time object detection. Although high-precision models improve reliability, their processing time and computational overhead can hinder real-time performance and raise safety concerns. This paper introduces an Edge-based Mixture-of-Experts Optimal Sensing (EMOS) System that addresses the challenge of co-achieving accuracy, latency and scene adaptivity, further demonstrated in the open-world autonomous driving scenarios. Algorithmically, EMOS fuses multimodal sensor streams via an Adaptive Multimodal Data Bridge and uses a scenario-aware MoE switch to activate only a complementary set of specialized experts as needed. The proposed hierarchical backpropagation and a multiscale pooling layer let model capacity scale with real-world demand complexity. System-wise, an edge-optimized runtime with accelerator-aware scheduling (e.g., ONNX/TensorRT), zero-copy buffering, and overlapped I/O-compute enforces explicit latency/accuracy budgets across diverse driving conditions. Experimental results establish EMOS as the new state of the art: on KITTI, it increases average AP by 3.17% while running $2.6times$ faster on Nvidia Jetson. On nuScenes, it improves accuracy by 0.2% mAP and 0.5% NDS, with 34% fewer parameters and a $15.35times$ Nvidia Jetson speedup. Leveraging multimodal data and intelligent experts cooperation, EMOS delivers accurate, efficient and edge-adaptive perception system for autonomous vehicles, thereby ensuring robust, timely responses in real-world scenarios.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"87 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145117071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
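The scenario-aware MoE switch in EMOS activates only a complementary subset of specialized experts per scene, which is what lets accuracy scale without blowing the latency budget. The sketch below is a generic sparse top-k mixture-of-experts router in PyTorch; the MLP experts, the pooled scene descriptor used as gating input, and k = 2 are assumptions for illustration, not components of the EMOS system.

```python
# Generic sparse top-k mixture-of-experts router (PyTorch sketch).
# Only the k highest-scoring experts are evaluated per input.
import torch
import torch.nn as nn


class SparseMoE(nn.Module):
    def __init__(self, dim: int = 256, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts)   # scenario-aware switch (assumed linear)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, dim) scene descriptor; only the top-k experts run per sample.
        scores = self.gate(x)                               # (B, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)      # sparse selection
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)   # (n, 1) routing weights
                    out[mask] = out[mask] + w * expert(x[mask])
        return out


if __name__ == "__main__":
    moe = SparseMoE()
    print(moe(torch.randn(8, 256)).shape)   # torch.Size([8, 256])
```

In a deployed detector the routing decision would typically be made once per frame from cheap scene statistics, so the cost of the unselected experts is never paid.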
ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-22 · DOI: 10.1109/tpami.2025.3613269
Shuxiao Ding, Lukas Schneider, Marius Cordts, Juergen Gall
{"title":"ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association","authors":"Shuxiao Ding, Lukas Schneider, Marius Cordts, Juergen Gall","doi":"10.1109/tpami.2025.3613269","DOIUrl":"https://doi.org/10.1109/tpami.2025.3613269","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"41 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145116227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
IF 23.6 · Q1 · Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence · Pub Date: 2025-09-22 · DOI: 10.1109/tpami.2025.3612480
Anke Tang, Li Shen, Yong Luo, Shuai Xie, Han Hu, Lefei Zhang, Bo Du, Dacheng Tao
{"title":"Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models","authors":"Anke Tang, Li Shen, Yong Luo, Shuai Xie, Han Hu, Lefei Zhang, Bo Du, Dacheng Tao","doi":"10.1109/tpami.2025.3612480","DOIUrl":"https://doi.org/10.1109/tpami.2025.3612480","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"18 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145116229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0