深度分组图像配准的贝叶斯无监督解缠方法。

IF 18.6 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Pattern Analysis and Machine Intelligence Pub Date : 2025-09-12 DOI:10.1109/tpami.2025.3609521

Xinzhe Luo,Xin Wang,Linda Shapiro,Chun Yuan,Jianfeng Feng,Xiahai Zhuang

{"title":"深度分组图像配准的贝叶斯无监督解缠方法。","authors":"Xinzhe Luo,Xin Wang,Linda Shapiro,Chun Yuan,Jianfeng Feng,Xiahai Zhuang","doi":"10.1109/tpami.2025.3609521","DOIUrl":null,"url":null,"abstract":"This article presents a general Bayesian learning framework for multi-modal groupwise image registration. The method builds on probabilistic modelling of the image generative process, where the underlying common anatomy and geometric variations of the observed images are explicitly disentangled as latent variables. Therefore, groupwise image registration is achieved via hierarchical Bayesian inference. We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables, where the registration parameters can be explicitly estimated in a mathematically interpretable fashion. Remarkably, this new paradigm learns groupwise image registration in an unsupervised closed-loop self-reconstruction process, sparing the burden of designing complex image-based similarity measures. The computationally efficient disentangled network architecture is also inherently scalable and flexible, allowing for groupwise registration on large-scale image groups with variable sizes. Furthermore, the inferred structural representations from multi-modal images via disentanglement learning are capable of capturing the latent anatomy of the observations with visual semantics. Extensive experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images. The results have demonstrated the superiority of our method over conventional similarity-based approaches in terms of accuracy, efficiency, scalability, and interpretability.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"11 1","pages":""},"PeriodicalIF":18.6000,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration.\",\"authors\":\"Xinzhe Luo,Xin Wang,Linda Shapiro,Chun Yuan,Jianfeng Feng,Xiahai Zhuang\",\"doi\":\"10.1109/tpami.2025.3609521\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This article presents a general Bayesian learning framework for multi-modal groupwise image registration. The method builds on probabilistic modelling of the image generative process, where the underlying common anatomy and geometric variations of the observed images are explicitly disentangled as latent variables. Therefore, groupwise image registration is achieved via hierarchical Bayesian inference. We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables, where the registration parameters can be explicitly estimated in a mathematically interpretable fashion. Remarkably, this new paradigm learns groupwise image registration in an unsupervised closed-loop self-reconstruction process, sparing the burden of designing complex image-based similarity measures. The computationally efficient disentangled network architecture is also inherently scalable and flexible, allowing for groupwise registration on large-scale image groups with variable sizes. Furthermore, the inferred structural representations from multi-modal images via disentanglement learning are capable of capturing the latent anatomy of the observations with visual semantics. Extensive experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images. The results have demonstrated the superiority of our method over conventional similarity-based approaches in terms of accuracy, efficiency, scalability, and interpretability.\",\"PeriodicalId\":13426,\"journal\":{\"name\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":18.6000,\"publicationDate\":\"2025-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Pattern Analysis and Machine Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/tpami.2025.3609521\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tpami.2025.3609521","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

本文提出了一种通用的贝叶斯学习框架，用于多模态分组图像配准。该方法建立在图像生成过程的概率建模上，其中观察到的图像的潜在共同解剖结构和几何变化作为潜在变量被明确地解开。因此，通过层次贝叶斯推理实现图像分组配准。我们提出了一种新的分层变分自编码架构来实现潜在变量的推理过程，其中注册参数可以以数学可解释的方式显式估计。值得注意的是，这种新范式在无监督的闭环自重建过程中学习分组图像配准，从而省去了设计复杂的基于图像的相似性度量的负担。计算效率高的解纠缠网络体系结构也具有固有的可扩展性和灵活性，允许对具有可变大小的大型图像组进行分组配准。此外，通过解纠缠学习从多模态图像中推断出的结构表征能够用视觉语义捕获观察的潜在解剖结构。为了验证所提出的框架，我们进行了大量的实验，包括来自心脏、大脑和腹部医学图像的四种不同数据集。结果表明，我们的方法在准确性、效率、可扩展性和可解释性方面优于传统的基于相似性的方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration.

This article presents a general Bayesian learning framework for multi-modal groupwise image registration. The method builds on probabilistic modelling of the image generative process, where the underlying common anatomy and geometric variations of the observed images are explicitly disentangled as latent variables. Therefore, groupwise image registration is achieved via hierarchical Bayesian inference. We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables, where the registration parameters can be explicitly estimated in a mathematically interpretable fashion. Remarkably, this new paradigm learns groupwise image registration in an unsupervised closed-loop self-reconstruction process, sparing the burden of designing complex image-based similarity measures. The computationally efficient disentangled network architecture is also inherently scalable and flexible, allowing for groupwise registration on large-scale image groups with variable sizes. Furthermore, the inferred structural representations from multi-modal images via disentanglement learning are capable of capturing the latent anatomy of the observations with visual semantics. Extensive experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images. The results have demonstrated the superiority of our method over conventional similarity-based approaches in terms of accuracy, efficiency, scalability, and interpretability.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Pattern Analysis and Machine Intelligence 工程技术-工程：电子与电气

CiteScore

28.40

自引率

3.00%

发文量

885

审稿时长

8.5 months

期刊介绍： The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.