IEEE Transactions on Pattern Analysis and Machine Intelligence: Latest Articles

Surfel-based Gaussian Inverse Rendering for Fast and Relightable Dynamic Human Reconstruction from Monocular Videos
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-14 DOI: 10.1109/TPAMI.2025.3599415
Yiqun Zhao, Chenming Wu, Binbin Huang, Yihao Zhi, Chen Zhao, Jingdong Wang, Shenghua Gao
Abstract: Efficient and accurate reconstruction of a relightable, dynamic clothed human avatar from a monocular video is crucial for the entertainment industry. This paper presents SGIA (Surfel-based Gaussian Inverse Avatar), which introduces efficient training and rendering for relightable dynamic human reconstruction. SGIA advances previous Gaussian Avatar methods by comprehensively modeling Physically-Based Rendering (PBR) properties for clothed human avatars, allowing avatars to be manipulated into novel poses under diverse lighting conditions. Specifically, our approach integrates pre-integration and image-based lighting for fast light calculations that surpass the performance of existing implicit-based techniques. To address challenges related to material-lighting disentanglement and accurate geometry reconstruction, we propose an innovative occlusion approximation strategy and a progressive training approach. Extensive experiments demonstrate that SGIA not only achieves highly accurate physical properties but also significantly enhances the realistic relighting of dynamic human avatars, providing a substantial speed advantage. More results are shown on our project page: https://GS-IA.github.io.
Citations: 0
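The SGIA entry above credits its speed to pre-integration and image-based lighting. Purely as a rough illustration of that idea, and not the paper's actual pipeline, the sketch below shades diffuse-only surfels by looking up a pre-convolved irradiance environment map at each normal direction; the `irradiance_map` callable, the scalar visibility term, and all shapes are assumptions.

```python
import numpy as np

def shade_surfels(albedo, normals, irradiance_map, visibility):
    """Toy pre-integrated image-based shading for diffuse-only surfels.

    albedo         : (N, 3) per-surfel base color in [0, 1]
    normals        : (N, 3) unit surface normals
    irradiance_map : callable mapping unit directions (N, 3) -> RGB irradiance (N, 3),
                     e.g. a lookup into a cosine-pre-convolved environment map
    visibility     : (N,) scalar occlusion approximation in [0, 1]
    """
    E = irradiance_map(normals)            # the light integral is pre-computed, so this is one lookup per surfel
    diffuse = albedo / np.pi * E           # Lambertian BRDF times irradiance
    return diffuse * visibility[:, None]   # attenuate by the occlusion estimate

# Toy usage: a single soft light from +y.
sky = lambda n: np.clip(n[:, 1:2], 0.0, None) * np.array([2.0, 2.0, 2.2])
rgb = shade_surfels(np.full((3, 3), 0.5), np.eye(3), sky, np.ones(3))
```

Because the integral over incoming light is baked into the environment lookup, shading cost per surfel stays constant regardless of how many lights the environment contains, which is the practical point of pre-integration.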
Scaling up Multimodal Pre-Training for Sign Language Understanding
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-14 DOI: 10.1109/TPAMI.2025.3599313
Wengang Zhou, Weichao Zhao, Hezhen Hu, Zecheng Li, Houqiang Li
Abstract: Sign language pre-training (SLP) has significantly improved the performance of diverse sign language understanding (SLU) tasks. However, many existing methods employ pre-training techniques that are tailored to a specific task with a small data scale, resulting in limited model generalization. Others focus solely on exploring visual cues, neglecting the semantic textual cues embedded in sign translation texts. These limitations inherently diminish the representative capacity of pre-trained models. To this end, we present a multimodal SLP framework that leverages rich visual contextual information and vision-language semantic consistency with massively available data to enhance the representative capability of sign language video. Specifically, we first curate a large-scale text-labeled sign pose dataset (~1.5M), namely SL-1.5M, from various sources to alleviate the scarcity of pre-training data. Subsequently, we propose a pre-training framework that integrates sign-text contrastive learning with masked pose modeling as the pretext task. In this way, our framework is empowered to effectively capture contextual cues within sign pose sequences and learn visual representations by aligning semantically rich text features in a latent space. Moreover, in order to grasp the comprehensive meaning of sign language videos, we concurrently model manual and non-manual information to ensure the holistic integrity of visual content. To validate the generalization and superiority of our proposed pre-training framework, we conduct extensive experiments without intricate design on diverse SLU tasks, achieving new state-of-the-art performance on multiple benchmarks.
Citations: 0
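The framework in the entry above combines sign-text contrastive learning with masked pose modeling. As a minimal illustration of the contrastive half only, here is a symmetric InfoNCE loss over paired sign-video and text embeddings; the batch construction, temperature value, and embedding dimensions are assumptions rather than the paper's settings.

```python
import numpy as np

def _cross_entropy(logits, labels):
    """Row-wise softmax cross-entropy with integer targets."""
    logits = logits - logits.max(axis=1, keepdims=True)                  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

def sign_text_contrastive_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE between L2-normalized sign-video and text embeddings.

    video_emb, text_emb: (B, D) arrays; row i of each side forms a matched pair.
    """
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature          # (B, B) cosine similarities
    labels = np.arange(len(v))              # positives lie on the diagonal
    return 0.5 * (_cross_entropy(logits, labels) + _cross_entropy(logits.T, labels))
```

Minimizing this pulls each sign-video embedding toward the embedding of its own translation text and away from the other texts in the batch, which is the alignment in latent space that the abstract describes.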
Dark Noise Diffusion: Noise Synthesis for Low-Light Image Denoising
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598330
Liying Lu, Raphael Achddou, Sabine Susstrunk
Abstract: Low-light photography produces images with low signal-to-noise ratios due to limited photons. In such conditions, common approximations like the Gaussian noise model fall short, and many denoising techniques fail to remove noise effectively. Although deep-learning methods perform well, they require large datasets of paired images that are impractical to acquire. As a remedy, synthesizing realistic low-light noise has gained significant attention. In this paper, we investigate the ability of diffusion models to capture the complex distribution of low-light noise. We show that a naive application of conventional diffusion models is inadequate for this task and propose three key adaptations that enable high-precision noise generation: a two-branch architecture to better model signal-dependent and signal-independent noise, the incorporation of positional information to capture fixed-pattern noise, and a tailored diffusion noise schedule. Consequently, our model enables the generation of large datasets for training low-light denoising networks, leading to state-of-the-art performance. Through comprehensive analysis, including statistical evaluation and noise decomposition, we provide deeper insights into the characteristics of the generated data.
Citations: 0
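The entry above separates signal-dependent and signal-independent noise and adds positional information for fixed-pattern noise. The sketch below is not the paper's learned diffusion model; it is a classical Poisson-Gaussian stand-in with a per-row pattern term, included only to make the three noise categories concrete. Gain and sigma values are arbitrary assumptions.

```python
import numpy as np

def sample_low_light_noise(clean, gain=4.0, read_sigma=2.0, row_sigma=0.5, rng=None):
    """Toy physics-based low-light noise (shot + read + row pattern), in sensor units.

    clean: (H, W) linear clean image.
    """
    rng = np.random.default_rng() if rng is None else rng
    shot = rng.poisson(np.clip(clean, 0, None) / gain) * gain   # signal-dependent: variance grows with the signal
    read = rng.normal(0.0, read_sigma, clean.shape)             # signal-independent Gaussian read noise
    rows = rng.normal(0.0, row_sigma, (clean.shape[0], 1))      # position-dependent banding, constant along each row
    return shot + read + rows
```

A Gaussian-only model would miss both the intensity dependence of the shot term and the spatial structure of the row term, which is the gap the learned noise synthesizer is meant to close.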
Video Diffusion Posterior Sampling for Seeing Beyond Dynamic Scattering Layers
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598457
Taesung Kwon, Gookho Song, Yoosun Kim, Jeongsol Kim, Jong Chul Ye, Mooseok Jang
Abstract: Imaging through scattering is challenging, as even a thin layer can randomly perturb light propagation and obscure hidden objects. Accurate closed-form modeling of forward scattering remains difficult, particularly for dynamically varying or thick layers. Here, we introduce a plug-and-play inverse solver based on video diffusion models with a physically grounded forward model tailored to dynamic scattering layers. Our method extends Diffusion Posterior Sampling (DPS) to the spatio-temporal domain, thereby capturing statistical correlations between video frames and scattered signals more effectively. Leveraging these temporal correlations, our approach recovers high-resolution spatial details that spatial-only methods typically fail to reconstruct. We also propose an inference-time optimization with a lightweight mapping network, enabling joint estimation of low-dimensional forward-model parameters without additional training. This joint optimization significantly enhances adaptability to unknown, time-varying degradations, making our method suitable for blind inverse scattering problems. We validate our approach across diverse conditions, including different scene types, layer thicknesses, and scene-layer distances, and real-world experiments using multiple datasets confirm its robustness and effectiveness, even under real noise and forward-model approximation mismatches. Finally, we validate our method as a general video-restoration framework across dehazing, deblurring, inpainting, and blind restoration under complex optical aberrations. Our implementation is available at: https://github.com/star-kwon/VDPS.
Citations: 0
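The entry above builds on Diffusion Posterior Sampling. As a heavily simplified illustration of the underlying DPS idea, and not the paper's spatio-temporal solver, the step below nudges an unconditional reverse-diffusion proposal toward data consistency with an assumed known linear forward operator; it also approximates the denoiser's Jacobian by the identity, a shortcut that real DPS does not take.

```python
import numpy as np

def guided_reverse_step(x_prev, x0_hat, y, forward_op, adjoint_op, step_size=1.0):
    """One simplified posterior-sampling correction.

    x_prev     : unconditional proposal for x_{t-1} from the diffusion model
    x0_hat     : the model's current estimate of the clean signal x_0
    y          : degraded measurement (e.g. frames seen through a scattering layer)
    forward_op : callable A(.), the assumed linear degradation
    adjoint_op : callable A^T(.), its adjoint
    """
    residual = forward_op(x0_hat) - y       # data-consistency error
    grad = adjoint_op(residual)             # gradient of 0.5 * ||A(x0_hat) - y||^2 w.r.t. x0_hat
    return x_prev - step_size * grad        # Jacobian of x0_hat w.r.t. x_t approximated by identity

# Toy usage with a blur-like averaging operator acting along the last axis.
A = lambda x: (x + np.roll(x, 1, axis=-1)) / 2.0
At = lambda r: (r + np.roll(r, -1, axis=-1)) / 2.0
x = guided_reverse_step(np.zeros((4, 8)), np.ones((4, 8)), A(np.ones((4, 8))), A, At)
```

In the paper's blind setting the forward model is not known in advance; the abstract's lightweight mapping network estimates its low-dimensional parameters jointly at inference time.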
DreamCraft3D++: Efficient Hierarchical 3D Generation with Multi-Plane Reconstruction Model
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598772
Jingxiang Sun, Cheng Peng, Ruizhi Shao, Yuan-Chen Guo, Xiaochen Zhao, Yangguang Li, YanPei Cao, Bo Zhang, Yebin Liu
Abstract: We introduce DreamCraft3D++, an extension of DreamCraft3D that enables efficient high-quality generation of complex 3D assets. DreamCraft3D++ inherits the multi-stage generation process of DreamCraft3D, but replaces the time-consuming geometry sculpting optimization with a feed-forward multi-plane-based reconstruction model, speeding up the process by 1000x. For texture refinement, we propose a training-free IP-Adapter module that is conditioned on the enhanced multi-view images to improve texture and geometry consistency, providing a 4x faster alternative to DreamCraft3D's DreamBooth fine-tuning. Experiments on diverse datasets demonstrate DreamCraft3D++'s ability to generate creative 3D assets with intricate geometry and realistic 360° textures, outperforming state-of-the-art image-to-3D methods in quality and speed. The full implementation will be open-sourced to enable new possibilities in 3D content creation.
Citations: 0
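The entry above replaces per-asset geometry optimization with a feed-forward multi-plane reconstruction model, whose exact architecture the abstract does not spell out. The snippet below therefore only illustrates the generic multi-plane (tri-plane-style) feature lookup that such models commonly rely on: a query point is projected onto axis-aligned feature planes and the sampled features are summed. The plane layout, resolution, channel count, and nearest-neighbor sampling are all assumptions.

```python
import numpy as np

def query_triplane(features_xy, features_xz, features_yz, points):
    """Nearest-neighbor feature lookup on three axis-aligned planes.

    features_*: (R, R, C) feature grids; points: (N, 3) coordinates in [-1, 1]^3.
    Returns (N, C) per-point features (sum over the three planes).
    """
    R = features_xy.shape[0]
    idx = np.clip(((points + 1.0) * 0.5 * (R - 1)).round().astype(int), 0, R - 1)
    x, y, z = idx[:, 0], idx[:, 1], idx[:, 2]
    return features_xy[x, y] + features_xz[x, z] + features_yz[y, z]

# Toy usage: 32x32 planes with 8 channels, queried at 5 random points.
planes = [np.random.randn(32, 32, 8) for _ in range(3)]
feats = query_triplane(*planes, np.random.uniform(-1, 1, (5, 3)))   # -> (5, 8)
```

Because the planes are produced in a single forward pass rather than optimized per asset, querying geometry becomes a lookup instead of an optimization loop, which is where the reported speedup over sculpting comes from.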
HHAvatar: Gaussian Head Avatar with Dynamic Hairs
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3597940
Zhanfeng Liao, Yuelang Xu, Zhe Li, Qijing Li, Boyao Zhou, Ruifeng Bai, Di Xu, Hongwen Zhang, Yebin Liu
Abstract: Creating high-fidelity 3D head avatars has always been a research hotspot, but it remains a great challenge under lightweight sparse-view setups. In this paper, we propose HHAvatar, a high-fidelity head avatar represented by controllable 3D Gaussians with dynamic hair modeling. We first use 3D Gaussians to represent the appearance of the head, and then jointly optimize neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, so our method can model fine-grained dynamic details while ensuring expression accuracy. Furthermore, we devise a well-designed geometry-guided initialization strategy based on implicit SDF and Deep Marching Tetrahedra for the stability and convergence of the training procedure. To address dynamic hair modeling, we introduce a hybrid head model into our Gaussian-based avatar representation, together with a training method that considers temporal information and an occlusion-perception module, to model the non-rigid motion of hair. Experiments show that our approach outperforms other state-of-the-art sparse-view methods, achieving ultra-high-fidelity rendering quality at 2K resolution even under exaggerated expressions, and drives hair plausibly with the motion of the head. Project page: https://liaozhanfeng.github.io/HHAvatar.
Citations: 0
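The HHAvatar entry above couples neutral 3D Gaussians with an MLP-based deformation field that is conditioned on the expression. To make that coupling concrete, here is a tiny, randomly initialized stand-in deformation MLP that maps a Gaussian center plus an expression code to a positional offset. Layer sizes, the expression-code dimension, and the tanh activation are assumptions, and the real model also adjusts other Gaussian attributes and handles hair motion separately.

```python
import numpy as np

class ToyDeformationField:
    """(gaussian center, expression code) -> positional offset; weights would normally be trained."""

    def __init__(self, expr_dim=16, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (3 + expr_dim, hidden))
        self.w2 = rng.normal(0.0, 0.1, (hidden, 3))

    def __call__(self, centers, expr_code):
        expr = np.broadcast_to(expr_code, (len(centers), len(expr_code)))   # share one code across all Gaussians
        h = np.tanh(np.concatenate([centers, expr], axis=1) @ self.w1)
        return h @ self.w2

# Driving the avatar: deformed centers = neutral centers + offset(neutral centers, expression code).
field = ToyDeformationField()
neutral = np.zeros((100, 3))
deformed = neutral + field(neutral, np.zeros(16))
```

Keeping a neutral Gaussian set and expressing every frame as an offset from it is what lets the two parts be optimized jointly: the neutral Gaussians absorb static appearance while the field only has to explain expression-driven change.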
Identifying Semantic Component for Robust Molecular Property Prediction
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598461
Zijian Li, Zunhong Xu, Ruichu Cai, Zhenhui Yang, Yuguang Yan, Zhifeng Hao, Guangyi Chen, Kun Zhang
Abstract: Although graph neural networks have achieved great success in the task of molecular property prediction in recent years, their generalization ability under out-of-distribution (OOD) settings is still under-explored. Most existing methods rely on learning discriminative representations for prediction, often assuming that the underlying semantic components are correctly identified. However, this assumption does not always hold, leading to potential misidentifications that affect model robustness. Different from these discriminative methods, we propose a generative model that ensures Semantic-Component Identifiability, named SCI. We demonstrate that the latent variables in this generative model can be explicitly identified into semantic-relevant (SR) and semantic-irrelevant (SI) components, which contributes to better OOD generalization by exploiting the minimal-change property of causal mechanisms. Specifically, we first formulate the data generation process from the atom level to the molecular level, where the latent space is split into SI substructures, SR substructures, and SR atom variables. Subsequently, to reduce misidentification, we constrain the SR atom variables to change minimally and add a semantic latent substructure regularization to mitigate the variance of the SR substructure under augmented domain changes. Under mild assumptions, we prove the block-wise identifiability of the SR substructure and the component-wise identifiability of SR atom variables. Experimental studies achieve state-of-the-art performance and show general improvement on 21 datasets in 3 mainstream benchmarks. Moreover, the visualization results of the proposed SCI method provide insightful case studies and explanations for the prediction results.
Citations: 0
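Among the ingredients in the entry above is a semantic latent substructure regularization that mitigates the variance of the semantic-relevant (SR) substructure under augmented domain changes. As a loose illustration of that single term only, not of the full generative model or its identifiability theory, the function below penalizes the variance of SR embeddings computed from several augmentations of the same molecule; the embedding shape is an assumption.

```python
import numpy as np

def sr_variance_penalty(z_sr_views):
    """Variance of semantic-relevant embeddings across K augmented views.

    z_sr_views: (K, D) SR-substructure embeddings of the same molecule under K
    different augmentations; a small value means the SR part barely changes.
    """
    centered = z_sr_views - z_sr_views.mean(axis=0, keepdims=True)
    return float((centered ** 2).mean())
```

Driving this penalty down encourages the SR component to stay stable while the SI component is left free to absorb augmentation- and domain-specific variation, which is the split the abstract relies on for OOD robustness.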
Multi-View Hand Reconstruction With a Point-Embedded Transformer
IF 18.6
IEEE transactions on pattern analysis and machine intelligence, vol. 47, no. 11, pp. 10680-10695. Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598089
Lixin Yang, Licheng Zhong, Pengxiang Zhu, Xinyu Zhan, Junxiao Kong, Jian Xu, Cewu Lu
Abstract: This work introduces a novel and generalizable multi-view Hand Mesh Reconstruction (HMR) model, named POEM, designed for practical use in real-world hand motion capture scenarios. The advances of the POEM model consist of two main aspects. First, concerning the modeling of the problem, we propose embedding a static basis point within the multi-view stereo space. A point represents a natural form of 3D information and serves as an ideal medium for fusing features across different views, given its varied projections across these views. Consequently, our method harnesses a simple yet effective idea: a complex 3D hand mesh can be represented by a set of 3D basis points that 1) are embedded in the multi-view stereo, 2) carry features from the multi-view images, and 3) encompass the hand. The second advance lies in the training strategy. We utilize a combination of five large-scale multi-view datasets and employ randomization in the number, order, and poses of the cameras. By processing such a vast amount of data and a diverse array of camera configurations, our model demonstrates notable generalizability in real-world applications. As a result, POEM presents a highly practical, plug-and-play solution that enables user-friendly, cost-effective multi-view motion capture for both left and right hands.
Citations: 0
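POEM's core idea in the entry above is that a static 3D basis point can gather features from every camera view through its different projections. The sketch below shows that mechanism in isolation: project world-space points with pinhole cameras and average the image features sampled at each projection. The camera convention (K, R, t with points assumed in front of every camera) and the nearest-pixel sampling are simplifying assumptions, not the paper's transformer-based fusion.

```python
import numpy as np

def fuse_point_features(points, feat_maps, Ks, Rs, ts):
    """Average per-view image features at the projections of 3D basis points.

    points    : (N, 3) world-space basis points
    feat_maps : list of V feature maps, each (H, W, C)
    Ks, Rs, ts: per-view intrinsics (3, 3), rotations (3, 3), translations (3,)
    """
    fused = np.zeros((len(points), feat_maps[0].shape[-1]))
    for fmap, K, R, t in zip(feat_maps, Ks, Rs, ts):
        cam = points @ R.T + t                  # world -> camera coordinates
        uvw = cam @ K.T                         # camera -> homogeneous pixel coordinates
        uv = uvw[:, :2] / uvw[:, 2:3]           # assumes positive depth in every view
        H, W = fmap.shape[:2]
        u = np.clip(uv[:, 0].round().astype(int), 0, W - 1)
        v = np.clip(uv[:, 1].round().astype(int), 0, H - 1)
        fused += fmap[v, u]                     # nearest-pixel feature lookup
    return fused / len(feat_maps)
```

Because each basis point lands at a different pixel in each view, averaging (or, in the paper, attending over) those per-view samples gives the point a view-consistent descriptor regardless of how many cameras are present, which is what makes the camera count and ordering easy to randomize during training.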
Depth Dynamics via One-Bit Frequency Probing in Embedded Direct Time-of-Flight Sensing
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598593
Seth Lindgren, Benjamin R Johnson, Lucas J Koerner
Abstract: Time-of-flight (ToF) sensors with single-photon avalanche diodes (SPADs) estimate depth by accumulating a histogram of photon return times, which discards the timing information required to measure depth dynamics, such as vibrations or transient motions. We introduce a method that transforms a direct ToF sensor into a depth frequency analyzer capable of measuring high-frequency motion and transient events using only lightweight, on-sensor computations. By replacing conventional discrete Fourier transforms (DFTs) with one-bit probing sinusoids generated via oversampled sigma-delta modulation, we enable in-pixel frequency analysis without multipliers or floating-point operations. We extend the lightweight analysis of depth dynamics to Haar wavelets for time-localized detection of brief, non-repetitive depth changes. We validate our approach through simulation and hardware experiments, showing that it achieves noise performance approaching that of full-resolution DFTs, detects sub-millimeter motions above 6 kHz, and localizes millisecond-scale transients. Using a laboratory ToF setup, we demonstrate applications in oscillatory motion analysis and depth edge detection. This work has the potential to enable a new class of compact, motion-aware ToF sensors for embedded deployment in industrial predictive maintenance, structural health monitoring, robotic perception, and dynamic scene understanding.
Citations: 0
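The entry above replaces DFTs with one-bit probing sinusoids generated by oversampled sigma-delta modulation, so that frequency analysis reduces to sign flips and additions. The sketch below reproduces only the flavor of that idea: a first-order sigma-delta modulator turns a probe sinusoid into a ±1 sequence, and correlating a depth trace against the one-bit probes approximates the Fourier coefficient at that frequency. The sampling rate, probe frequency, and first-order modulator are assumptions, not the sensor's actual on-pixel implementation.

```python
import numpy as np

def sigma_delta_one_bit(signal):
    """First-order sigma-delta modulation of a signal in [-1, 1] into a +/-1 bit-stream."""
    bits = np.empty(len(signal))
    integrator, prev_bit = 0.0, 0.0
    for i, x in enumerate(signal):
        integrator += x - prev_bit                 # accumulate the quantization error
        prev_bit = 1.0 if integrator >= 0 else -1.0
        bits[i] = prev_bit
    return bits

def one_bit_frequency_probe(depth_trace, freq_hz, sample_rate_hz):
    """Approximate the Fourier coefficient of a depth trace at freq_hz using +/-1 probes."""
    t = np.arange(len(depth_trace)) / sample_rate_hz
    cos_bits = sigma_delta_one_bit(np.cos(2 * np.pi * freq_hz * t))
    sin_bits = sigma_delta_one_bit(np.sin(2 * np.pi * freq_hz * t))
    x = depth_trace - depth_trace.mean()           # remove the static depth offset
    # Multiplying by +/-1 is only a sign flip, so the on-sensor work is additions.
    return complex(np.dot(x, cos_bits), -np.dot(x, sin_bits)) / len(x)

# Toy check: a 200 Hz, 0.3 mm vibration sampled at 10 kHz for one second.
fs, f0 = 10_000, 200
trace = 1000.0 + 0.3 * np.sin(2 * np.pi * f0 * np.arange(fs) / fs)
amp = 2 * abs(one_bit_frequency_probe(trace, f0, fs))   # roughly 0.3, up to quantization noise
```

The oversampled modulator pushes the quantization error of the ±1 probes to high frequencies, which is why the one-bit correlation can approach the noise performance of a full-resolution DFT at the probed frequency.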
SPARE: Symmetrized Point-to-Plane Distance for Robust Non-Rigid 3D Registration
IF 18.6
IEEE transactions on pattern analysis and machine intelligence Pub Date: 2025-08-13 DOI: 10.1109/TPAMI.2025.3598630
Yuxin Yao, Bailin Deng, Junhui Hou, Juyong Zhang
Abstract: Existing optimization-based methods for non-rigid registration typically minimize an alignment error metric based on the point-to-point or point-to-plane distance between corresponding point pairs on the source surface and target surface. However, these metrics can result in slow convergence or a loss of detail. In this paper, we propose SPARE, a novel formulation that utilizes a symmetrized point-to-plane distance for robust non-rigid registration. The symmetrized point-to-plane distance relies on both the positions and normals of the corresponding points, resulting in a more accurate approximation of the underlying geometry, and can achieve higher accuracy than existing methods. To solve this optimization problem efficiently, we introduce an as-rigid-as-possible regularization term to estimate the deformed normals and propose an alternating minimization solver using a majorization-minimization strategy. Moreover, for effective initialization of the solver, we incorporate a deformation graph-based coarse alignment that improves registration quality and efficiency. Extensive experiments show that the proposed method greatly improves the accuracy of non-rigid registration and maintains relatively high solution efficiency. The code is publicly available at https://github.com/yaoyx689/spare.
Citations: 0
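For the SPARE entry above, the key metric is easiest to see per correspondence pair: one common symmetrization penalizes the displacement between the two matched points projected onto both the source normal and the target normal, instead of onto the target normal alone. The minimal energy below is written only to make that metric concrete; the paper's full formulation additionally carries the deformation variables, deformed-normal estimation via the as-rigid-as-possible regularization, and the majorization-minimization solver.

```python
import numpy as np

def symmetrized_point_to_plane(src_pts, src_normals, tgt_pts, tgt_normals):
    """Sum over pairs of the squared displacement projected onto BOTH unit normals.

    src_pts, tgt_pts         : (N, 3) corresponding points on source and target
    src_normals, tgt_normals : (N, 3) unit normals at those points
    """
    d = src_pts - tgt_pts
    e_src = np.einsum('ij,ij->i', d, src_normals)   # displacement along the source normal
    e_tgt = np.einsum('ij,ij->i', d, tgt_normals)   # displacement along the target normal
    return float((e_src ** 2 + e_tgt ** 2).sum())
```

Compared with the one-sided point-to-plane term (the e_tgt part alone), the symmetric form uses the normals of both surfaces, which is the property the abstract credits for a more accurate approximation of the underlying geometry.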