EPDiff: Erasure Perception Diffusion Model for Unsupervised Anomaly Detection in Preoperative Multimodal Images.

IF 9.8, Zone 1 (Medicine), Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

IEEE Transactions on Medical Imaging Pub Date : 2025-08-11 DOI:10.1109/tmi.2025.3597545

Jiazheng Wang, Min Liu, Wenting Shen, Renjie Ding, Yaonan Wang, Erik Meijering


Citations: 0

Abstract

Unsupervised anomaly detection (UAD) methods typically detect anomalies by learning and reconstructing the normative distribution. However, because anomalies continually invade and affect their surroundings, sub-healthy areas at the junction exhibit structural deformations that are easily misidentified as anomalies, which poses difficulties for UAD methods that learn only the normative distribution. Multimodal images can help address these challenges because they provide complementary information about anomalies. This paper therefore proposes a novel method for UAD in preoperative multimodal images, called the Erasure Perception Diffusion model (EPDiff). First, the Local Erasure Progressive Training (LEPT) framework is designed to better rebuild sub-healthy structures around anomalies through the diffusion model in a two-phase process. Initially, healthy images are used to capture deviation features, which are labeled as potential anomalies. Then, these anomalies are locally erased in multimodal images so that the model progressively learns sub-healthy structures and obtains a more detailed reconstruction around anomalies. Second, the Global Structural Perception (GSP) module is developed within the diffusion model to capture global structural representation and correlation within images and between modalities through interactions of high-level semantic information. In addition, a training-free Multimodal Attention Fusion (MAF) module is presented to perform weighted fusion of anomaly maps across modalities and produce binary anomaly outputs. Experimental results show that EPDiff improves the AUPRC and mDice scores by 2% and 3.9% on BraTS2021, and by 5.2% and 4.5% on Shifts, over state-of-the-art methods, demonstrating the applicability of EPDiff to diverse anomaly diagnosis tasks. The code is available at https://github.com/wjiazheng/EPDiff.
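To make the reconstruction-based pipeline described in the abstract more concrete, the following is a minimal NumPy sketch of its generic post-processing steps: per-modality residual maps, a weighted fusion of those maps, thresholding to a binary output, local erasure of flagged regions, and a Dice score. It is not the authors' implementation. The function names, the per-pixel softmax weighting, the `temperature` parameter, the fixed threshold, and the toy data are all assumptions made for illustration; the actual LEPT, GSP, and MAF designs are in the paper and the linked repository.

```python
import numpy as np


def residual_map(image, reconstruction):
    """Per-pixel absolute reconstruction error for one modality."""
    return np.abs(image - reconstruction)


def fuse_anomaly_maps(maps, temperature=1.0):
    """Fuse per-modality maps with per-pixel softmax weights.

    Assumption: softmax weighting stands in for the paper's MAF module,
    whose actual attention scheme is not described in the abstract.
    """
    stacked = np.stack(maps, axis=0)                      # (M, H, W)
    logits = stacked / temperature
    logits = logits - logits.max(axis=0, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    return (weights * stacked).sum(axis=0)                # (H, W)


def binarize(fused, threshold):
    """Turn a fused anomaly map into a binary mask (hypothetical fixed threshold)."""
    return fused > threshold


def erase_regions(image, mask, fill_value=0.0):
    """Locally erase flagged pixels, loosely mirroring the 'local erasure' idea."""
    return np.where(mask, fill_value, image)


def dice_score(pred, target, eps=1e-8):
    """Dice overlap between two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Toy example: two modalities, one square "lesion" with different contrast in each.
target = np.zeros((8, 8), dtype=bool)
target[2:5, 2:5] = True
images = [np.where(target, 1.0, 0.0), np.where(target, 0.5, 0.0)]
recons = [np.zeros((8, 8)), np.zeros((8, 8))]   # stand-in "healthy" reconstructions

maps = [residual_map(img, rec) for img, rec in zip(images, recons)]
fused = fuse_anomaly_maps(maps)
pred = binarize(fused, threshold=0.5)
erased = [erase_regions(img, pred) for img in images]
print("Dice on toy data:", dice_score(pred, target))
```

In this sketch the modality with the larger residual at a pixel receives the larger weight there, which is one simple way to let a modality that shows the anomaly clearly dominate the fused map. A real pipeline would obtain the reconstructions from a trained diffusion model rather than from zero-filled placeholders.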


Source journal: IEEE Transactions on Medical Imaging (Medicine: Imaging Science & Photographic Technology)

CiteScore: 21.80

Self-citation rate: 5.70%

Publication volume: 637

Review time: 5.6 months

Journal description: The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy. T-MI publishes original research papers that cover a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. By emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine by introducing new analysis methods. While the journal welcomes strong application papers that describe novel methods, it directs papers that focus solely on important applications using medically adopted or well-established methods without significant innovation in methodology to other journals. T-MI is indexed in PubMed® and MEDLINE®, which are products of the United States National Library of Medicine.