IEEE Transactions on Pattern Analysis and Machine Intelligence: Latest Articles

LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-02 DOI: 10.1109/tpami.2025.3617126
Lingdong Kong, Xiang Xu, Youquan Liu, Jun Cen, Runnan Chen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu
Citations: 0
Efficient 3D Surface Super-resolution via Normal-based Multimodal Restoration
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-02 DOI: 10.1109/tpami.2025.3614184
Miaohui Wang, Yunheng Liu, Wuyuan Xie, Boxin Shi, Jianmin Jiang
Abstract: High-fidelity 3D surfaces are essential for vision tasks across various domains such as medical imaging, cultural heritage preservation, quality inspection, virtual reality, and autonomous navigation. However, the intricate nature of 3D data representations poses significant challenges in restoring diverse 3D surfaces while capturing fine-grained geometric details at a low cost. This paper introduces an efficient multimodal normal-based 3D surface super-resolution (mn3DSSR) framework, designed to address the challenges of microgeometry enhancement and computational overhead. Specifically, we have constructed one of the largest normal-based multimodal datasets, ensuring superior data quality and diversity through meticulous subjective selection. Furthermore, we explore a new two-branch multimodal alignment approach along with a multimodal split fusion module to mitigate computational complexity while improving restoration performance. To address the limitations associated with normal-based multimodal learning, we develop novel normal-induced loss functions that facilitate geometric consistency and improve feature alignment. Extensive experiments conducted on seven benchmark datasets across four different 3D data representations demonstrate that mn3DSSR consistently outperforms state-of-the-art super-resolution methods in terms of restoration accuracy with high computational efficiency.
Citations: 0
$\beta$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-01 DOI: 10.1109/tpami.2025.3616249
Peng Ye, Tong He, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang
Citations: 0
Multi-Matrix Completion: A Novel Framework for Structurally Missing Elements
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-01 DOI: 10.1109/tpami.2025.3616607
Hao Nan Sheng, Zhi-Yong Wang, Hing Cheung So, Abdelhak M. Zoubir
Citations: 0
ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-01 DOI: 10.1109/tpami.2025.3616232
Zuomin Qu, Wei Lu, Xiangyang Luo, Qian Wang, Xiaochun Cao
Citations: 0
CAIT: Triple-Win Compression Towards High Accuracy, Fast Inference, and Favorable Transferability for ViTs
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-01 DOI: 10.1109/tpami.2025.3616854
Ao Wang, Hui Chen, Zijia Lin, Sicheng Zhao, Jungong Han, Guiguang Ding
Citations: 0
DTL: Parameter- and Memory-Efficient Disentangled Vision Learning
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-10-01 DOI: 10.1109/tpami.2025.3616318
Minghao Fu, Ke Zhu, Zonghao Ding, Jianxin Wu
Citations: 0
To Fold or Not to Fold: Graph Regularized Tensor Train for Visual Data Completion
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-09-30 DOI: 10.1109/tpami.2025.3615445
Le Xu, Lei Cheng, Ngai Wong, Yik-Chung Wu
Citations: 0
TransFace++: Rethinking the Face Recognition Paradigm with a Focus on Accuracy, Efficiency, and Security
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-09-30 DOI: 10.1109/tpami.2025.3616149
Jun Dan, Yang Liu, Baigui Sun, Jiankang Deng, Shan Luo
Abstract: Face Recognition (FR) technology has made significant strides with the emergence of deep learning. Typically, most existing FR models are built upon Convolutional Neural Networks (CNN) and take RGB face images as the model's input. In this work, we take a closer look at existing FR paradigms from high-efficiency, security, and precision perspectives, and identify the following three problems: (i) CNN frameworks are vulnerable in capturing global facial features and modeling the correlations between local facial features. (ii) Selecting RGB face images as the model's input greatly degrades the model's inference efficiency, increasing the extra computation costs. (iii) In the real-world FR system that operates on RGB face images, the integrity of user privacy may be compromised if hackers successfully penetrate and gain access to the input of this model. To solve these three issues, we propose two novel FR frameworks, i.e., TransFace and TransFace++, which successfully explore the feasibility of applying ViTs and image bytes to FR tasks, respectively. Firstly, as revealed from our observations, we find that ViTs perform vulnerably when applied to FR scenarios with extremely large datasets. We investigate the reasons for this phenomenon and discover that the existing data augmentation approaches and hard sample mining strategies are incompatible with ViTs-based FR backbone due to the lack of tailored consideration on preserving face structural information and leveraging each local token information. To remedy these problems, we first propose a superior FR model called TransFace, which contains a patch-level data augmentation strategy named Dominant Patch Amplitude Perturbation (DPAP) and a hard sample mining strategy named Entropy-guided Hard Sample Mining (EHSM). Furthermore, to improve inference efficiency and user privacy protection, we investigate the intrinsic property of image bytes and propose a superior FR model termed TransFace++. The proposed model is trained directly on image bytes, presenting a novel approach to address the aforementioned issues. Specifically, considering the importance of local correlations in bytes, an image bytes compression strategy named Topology-based Image Bytes Compression (TIBC) is introduced to extract prominent features from the raw bytes and integrate these features with byte embeddings, effectively mitigating information loss during the bytes mapping process. Moreover, to strengthen the model's perception on geometric information encoded in image bytes, a novel cross-attention module named Structure Information-guided Cross-Attention (SICA) is designed to inject structure information into byte tokens for information interaction, significantly improving the model's generalization ability. Experiments on popular face benchmarks demonstrate the superiority of our TransFace and TransFace++. Code is available at https://github.com/DanJun6737/TransFace_pp.
Citations: 0
Semantic-Assisted Object Clustering for Multi-Modal Referring Video Segmentation
IF 23.6, Zone 1 (Computer Science)
IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-09-29 DOI: 10.1109/tpami.2025.3612474
Yong Liu, Zhuoyan Luo, Yicheng Xiao, Yitong Wang, Shuyan Li, Xiu Li, Yujiu Yang, Yansong Tang
Citations: 0