{"title":"Efficient 3D Surface Super-resolution via Normal-based Multimodal Restoration.","authors":"Miaohui Wang,Yunheng Liu,Wuyuan Xie,Boxin Shi,Jianmin Jiang","doi":"10.1109/tpami.2025.3614184","DOIUrl":"https://doi.org/10.1109/tpami.2025.3614184","url":null,"abstract":"High-fidelity 3D surfaces are essential for vision tasks across various domains such as medical imaging, cultural heritage preservation, quality inspection, virtual reality, and autonomous navigation. However, the intricate nature of 3D data representations poses significant challenges in restoring diverse 3D surfaces while capturing fine-grained geometric details at a low cost. This paper introduces an efficient multimodal normal-based 3D surface super-resolution (mn3DSSR) framework, designed to address the challenges of microgeometry enhancement and computational overhead. Specifically, we have constructed one of the largest normal-based multimodal datasets, ensuring superior data quality and diversity through meticulous subjective selection. Furthermore, we explore a new two-branch multimodal alignment approach along with a multimodal split fusion module to mitigate computational complexity while improving restoration performance. To address the limitations associated with normal-based multimodal learning, we develop novel normal-induced loss functions that facilitate geometric consistency and improve feature alignment. Extensive experiments conducted on seven benchmark datasets across four different 3D data representations demonstrate that mn3DSSR consistently outperforms state-of-the-art super-resolution methods in terms of restoration accuracy with high computational efficiency.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"27 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145209327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"$\\beta$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search","authors":"Peng Ye, Tong He, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang","doi":"10.1109/tpami.2025.3616249","DOIUrl":"https://doi.org/10.1109/tpami.2025.3616249","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"17 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145203304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Matrix Completion: A Novel Framework for Structurally Missing Elements","authors":"Hao Nan Sheng, Zhi-Yong Wang, Hing Cheung So, Abdelhak M. Zoubir","doi":"10.1109/tpami.2025.3616607","DOIUrl":"https://doi.org/10.1109/tpami.2025.3616607","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"37 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145203310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CAIT: Triple-Win Compression Towards High Accuracy, Fast Inference, and Favorable Transferability for ViTs","authors":"Ao Wang, Hui Chen, Zijia Lin, Sicheng Zhao, Jungong Han, Guiguang Ding","doi":"10.1109/tpami.2025.3616854","DOIUrl":"https://doi.org/10.1109/tpami.2025.3616854","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"9 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145203303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"To Fold or Not to Fold: Graph Regularized Tensor Train for Visual Data Completion","authors":"Le Xu, Lei Cheng, Ngai Wong, Yik-Chung Wu","doi":"10.1109/tpami.2025.3615445","DOIUrl":"https://doi.org/10.1109/tpami.2025.3615445","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"17 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145195448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TransFace++: Rethinking the Face Recognition Paradigm with a Focus on Accuracy, Efficiency, and Security.","authors":"Jun Dan,Yang Liu,Baigui Sun,Jiankang Deng,Shan Luo","doi":"10.1109/tpami.2025.3616149","DOIUrl":"https://doi.org/10.1109/tpami.2025.3616149","url":null,"abstract":"Face Recognition (FR) technology has made significant strides with the emergence of deep learning. Typically, most existing FR models are built upon Convolutional Neural Networks (CNN) and take RGB face images as the model's input. In this work, we take a closer look at existing FR paradigms from high-efficiency, security, and precision perspectives, and identify the following three problems: (i) CNN frameworks are weak at capturing global facial features and modeling the correlations between local facial features. (ii) Selecting RGB face images as the model's input greatly degrades the model's inference efficiency and incurs extra computation costs. (iii) In a real-world FR system that operates on RGB face images, user privacy may be compromised if hackers successfully penetrate the system and gain access to the model's input. To solve these three issues, we propose two novel FR frameworks, i.e., TransFace and TransFace++, which successfully explore the feasibility of applying ViTs and image bytes to FR tasks, respectively. Firstly, we observe that ViTs perform poorly when applied to FR scenarios with extremely large datasets. We investigate the reasons for this phenomenon and discover that existing data augmentation approaches and hard sample mining strategies are incompatible with ViT-based FR backbones because they are not tailored to preserve face structural information or to leverage the information in each local token. To remedy these problems, we first propose a superior FR model called TransFace, which contains a patch-level data augmentation strategy named Dominant Patch Amplitude Perturbation (DPAP) and a hard sample mining strategy named Entropy-guided Hard Sample Mining (EHSM). Furthermore, to improve inference efficiency and user privacy protection, we investigate the intrinsic property of image bytes and propose a superior FR model termed TransFace++. The proposed model is trained directly on image bytes, presenting a novel approach to address the aforementioned issues. Specifically, considering the importance of local correlations in bytes, an image bytes compression strategy named Topology-based Image Bytes Compression (TIBC) is introduced to extract prominent features from the raw bytes and integrate these features with byte embeddings, effectively mitigating information loss during the bytes mapping process. Moreover, to strengthen the model's perception of the geometric information encoded in image bytes, a novel cross-attention module named Structure Information-guided Cross-Attention (SICA) is designed to inject structure information into byte tokens for information interaction, significantly improving the model's generalization ability. Experiments on popular face benchmarks demonstrate the superiority of our TransFace and TransFace++. Code is available at https://github.com/DanJun6737/TransFace_pp.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"7 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2025-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145194865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}