{"title":"EPDiff:用于术前多模态图像无监督异常检测的擦除感知扩散模型。","authors":"Jiazheng Wang,Min Liu,Wenting Shen,Renjie Ding,Yaonan Wang,Erik Meijering","doi":"10.1109/tmi.2025.3597545","DOIUrl":null,"url":null,"abstract":"Unsupervised anomaly detection (UAD) methods typically detect anomalies by learning and reconstructing the normative distribution. However, since anomalies constantly invade and affect their surroundings, sub-healthy areas in the junction present structural deformations that could be easily misidentified as anomalies, posing difficulties for UAD methods that solely learn the normative distribution. The use of multimodal images can facilitate to address the above challenges, as they can provide complementary information of anomalies. Therefore, this paper propose a novel method for UAD in preoperative multimodal images, called Erasure Perception Diffusion model (EPDiff). First, the Local Erasure Progressive Training (LEPT) framework is designed to better rebuild sub-healthy structures around anomalies through the diffusion model with a two-phase process. Initially, healthy images are used to capture deviation features labeled as potential anomalies. Then, these anomalies are locally erased in multimodal images to progressively learn sub-healthy structures, obtaining a more detailed reconstruction around anomalies. Second, the Global Structural Perception (GSP) module is developed in the diffusion model to realize global structural representation and correlation within images and between modalities through interactions of high-level semantic information. In addition, a training-free module, named Multimodal Attention Fusion (MAF) module, is presented for weighted fusion of anomaly maps between different modalities and obtaining binary anomaly outputs. Experimental results show that EPDiff improves the AUPRC and mDice scores by 2% and 3.9% on BraTS2021, and by 5.2% and 4.5% on Shifts over the state-of-the-art methods, which proves the applicability of EPDiff in diverse anomaly diagnosis. The code is available at https://github.com/wjiazheng/EPDiff.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"744 1","pages":""},"PeriodicalIF":9.8000,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EPDiff: Erasure Perception Diffusion Model for Unsupervised Anomaly Detection in Preoperative Multimodal Images.\",\"authors\":\"Jiazheng Wang,Min Liu,Wenting Shen,Renjie Ding,Yaonan Wang,Erik Meijering\",\"doi\":\"10.1109/tmi.2025.3597545\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Unsupervised anomaly detection (UAD) methods typically detect anomalies by learning and reconstructing the normative distribution. However, since anomalies constantly invade and affect their surroundings, sub-healthy areas in the junction present structural deformations that could be easily misidentified as anomalies, posing difficulties for UAD methods that solely learn the normative distribution. The use of multimodal images can facilitate to address the above challenges, as they can provide complementary information of anomalies. Therefore, this paper propose a novel method for UAD in preoperative multimodal images, called Erasure Perception Diffusion model (EPDiff). First, the Local Erasure Progressive Training (LEPT) framework is designed to better rebuild sub-healthy structures around anomalies through the diffusion model with a two-phase process. 
Initially, healthy images are used to capture deviation features labeled as potential anomalies. Then, these anomalies are locally erased in multimodal images to progressively learn sub-healthy structures, obtaining a more detailed reconstruction around anomalies. Second, the Global Structural Perception (GSP) module is developed in the diffusion model to realize global structural representation and correlation within images and between modalities through interactions of high-level semantic information. In addition, a training-free module, named Multimodal Attention Fusion (MAF) module, is presented for weighted fusion of anomaly maps between different modalities and obtaining binary anomaly outputs. Experimental results show that EPDiff improves the AUPRC and mDice scores by 2% and 3.9% on BraTS2021, and by 5.2% and 4.5% on Shifts over the state-of-the-art methods, which proves the applicability of EPDiff in diverse anomaly diagnosis. The code is available at https://github.com/wjiazheng/EPDiff.\",\"PeriodicalId\":13418,\"journal\":{\"name\":\"IEEE Transactions on Medical Imaging\",\"volume\":\"744 1\",\"pages\":\"\"},\"PeriodicalIF\":9.8000,\"publicationDate\":\"2025-08-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Medical Imaging\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1109/tmi.2025.3597545\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Medical Imaging","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/tmi.2025.3597545","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
EPDiff: Erasure Perception Diffusion Model for Unsupervised Anomaly Detection in Preoperative Multimodal Images.
Abstract:
Unsupervised anomaly detection (UAD) methods typically detect anomalies by learning and reconstructing the normative distribution. However, because anomalies continually invade and affect their surroundings, sub-healthy regions at the junction between anomalous and healthy tissue exhibit structural deformations that are easily misidentified as anomalies, which challenges UAD methods that learn only the normative distribution. Multimodal images can help address this challenge, as they provide complementary information about anomalies. This paper therefore proposes a novel method for UAD in preoperative multimodal images, called the Erasure Perception Diffusion model (EPDiff). First, a Local Erasure Progressive Training (LEPT) framework is designed to better reconstruct sub-healthy structures around anomalies through a two-phase diffusion process. Initially, healthy images are used to capture deviation features, which are labeled as potential anomalies. These anomalies are then locally erased in the multimodal images so that the model progressively learns sub-healthy structures, yielding a more detailed reconstruction around anomalies. Second, a Global Structural Perception (GSP) module is incorporated into the diffusion model to capture global structural representations and correlations within images and across modalities through interactions of high-level semantic information. In addition, a training-free Multimodal Attention Fusion (MAF) module performs weighted fusion of the anomaly maps from different modalities and produces binary anomaly outputs. Experimental results show that EPDiff improves the AUPRC and mDice scores by 2% and 3.9% on BraTS2021, and by 5.2% and 4.5% on Shifts, over state-of-the-art methods, demonstrating the applicability of EPDiff to diverse anomaly diagnosis tasks. The code is available at https://github.com/wjiazheng/EPDiff.
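To give a concrete sense of the training-free fusion step described in the abstract, the following NumPy sketch illustrates one way to perform attention-like weighted fusion of per-modality anomaly maps and binarize the result. It is only an illustrative sketch, not the authors' MAF implementation: the mean-response weighting, the temperature, and the fixed threshold are assumptions, and the actual module may weight and binarize differently (see the linked repository for the real code).

# Illustrative sketch only (not the authors' MAF code): training-free weighted
# fusion of per-modality anomaly maps into one binary anomaly mask.
import numpy as np

def fuse_anomaly_maps(anomaly_maps, temperature=1.0, threshold=0.5):
    """Fuse per-modality anomaly maps with attention-like weights.

    anomaly_maps: list of 2-D arrays (one per modality) with values roughly
                  in [0, 1], e.g. normalized reconstruction errors.
    Returns (fused_map, binary_mask).
    """
    maps = np.stack(anomaly_maps, axis=0)                   # (M, H, W)

    # Crude per-modality "attention" score: modalities with a stronger mean
    # anomaly response receive a larger fusion weight (assumed scheme).
    scores = maps.reshape(maps.shape[0], -1).mean(axis=1)   # (M,)
    weights = np.exp(scores / temperature)
    weights /= weights.sum()                                # softmax over modalities

    # Weighted fusion across modalities, then binarization.
    fused = np.tensordot(weights, maps, axes=1)             # (H, W)
    binary = (fused >= threshold).astype(np.uint8)
    return fused, binary

# Toy usage with two synthetic "modalities" (e.g. two MRI sequences).
rng = np.random.default_rng(0)
m1 = rng.random((64, 64)) * 0.3
m2 = rng.random((64, 64)) * 0.3
m1[20:30, 20:30] += 0.6    # simulated anomaly, strong in modality 1
m2[22:32, 22:32] += 0.4    # weaker, shifted response in modality 2
fused_map, mask = fuse_anomaly_maps([m1, m2])
print(fused_map.shape, int(mask.sum()))

In EPDiff the fusion weights come from attention between modalities rather than this fixed heuristic; the sketch only shows the overall shape of the computation, with per-modality maps in and one fused map plus binary mask out.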
Journal introduction:
The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy.
T-MI publishes original research papers covering a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. Emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine through new analysis methods.
While the journal welcomes strong application papers that describe novel methods, papers that focus solely on important applications using medically adopted or well-established methods, without significant methodological innovation, are directed to other journals. T-MI is indexed in PubMed® and MEDLINE®, which are products of the United States National Library of Medicine.