DiffMIC-v2: Medical Image Classification via Improved Diffusion Network

IEEE Transactions on Medical Imaging. Pub Date: 2025-01-15. DOI: 10.1109/TMI.2025.3530399

Yijun Yang; Huazhu Fu; Angelica I. Aviles-Rivero; Zhaohu Xing; Lei Zhu

Volume 44, Issue 5, Pages 2244-2255. IEEE Xplore: https://ieeexplore.ieee.org/document/10843287/


Abstract

Recently, denoising diffusion models have achieved outstanding success in generative image modeling and attracted significant attention in the computer vision community. Although a substantial amount of diffusion-based research has focused on generative tasks, few studies apply diffusion models to medical diagnosis. In this paper, we propose a diffusion-based network (named DiffMIC-v2) to address general medical image classification by eliminating unexpected noise and perturbations in image representations. To achieve this goal, we first devise an improved dual-conditional guidance strategy that conditions each diffusion step with multiple granularities to enhance step-wise regional attention. Furthermore, we design a novel heterologous diffusion process that achieves efficient visual representation learning in the latent space. We evaluate the effectiveness of DiffMIC-v2 on four medical classification tasks with different image modalities: thoracic disease classification on chest X-ray images, placental maturity grading on ultrasound images, skin lesion classification on dermatoscopic images, and diabetic retinopathy grading on fundus images. Experimental results demonstrate that DiffMIC-v2 outperforms state-of-the-art methods by a significant margin, which indicates the universality and effectiveness of the proposed model on multi-class and multi-label classification tasks. DiffMIC-v2 needs fewer iterations than our previous DiffMIC to obtain accurate estimations, and it also achieves greater runtime efficiency with superior results. The code will be publicly available at https://github.com/scott-yjyang/DiffMICv2.



