MedIENet: medical image enhancement network based on conditional latent diffusion model.

IF 3.2 3区 医学 Q2 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Weizhen Yuan, Yue Feng, Tiancai Wen, Guancong Luo, Jiexin Liang, Qianshuai Sun, Shufen Liang
{"title":"MedIENet: medical image enhancement network based on conditional latent diffusion model.","authors":"Weizhen Yuan, Yue Feng, Tiancai Wen, Guancong Luo, Jiexin Liang, Qianshuai Sun, Shufen Liang","doi":"10.1186/s12880-025-01909-5","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Deep learning necessitates a substantial amount of data, yet obtaining sufficient medical images is difficult due to concerns about patient privacy and high collection costs.</p><p><strong>Methods: </strong>To address this issue, we propose a conditional latent diffusion model-based medical image enhancement network, referred to as the Medical Image Enhancement Network (MedIENet). To meet the rigorous standards required for image generation in the medical imaging field, a multi-attention module is incorporated in the encoder of the denoising U-Net backbone. Additionally Rotary Position Embedding (RoPE) is integrated into the self-attention module to effectively capture positional information, while cross-attention is utilised to embed integrate class information into the diffusion process.</p><p><strong>Results: </strong>MedIENet is evaluated on three datasets: Chest CT-Scan images, Chest X-Ray Images (Pneumonia), and Tongue dataset. Compared to existing methods, MedIENet demonstrates superior performance in both fidelity and diversity of the generated images. Experimental results indicate that for downstream classification tasks using ResNet50, the Area Under the Receiver Operating Characteristic curve (AUROC) achieved with real data alone is 0.76 for the Chest CT-Scan images dataset, 0.87 for the Chest X-Ray Images (Pneumonia) dataset, and 0.78 for the Tongue Dataset. When using mixed data consisting of real data and generated data, the AUROC improves to 0.82, 0.94, and 0.82, respectively, reflecting increases of approximately 6%, 7%, and 4%.</p><p><strong>Conclusion: </strong>These findings indicate that the images generated by MedIENet can enhance the performance of downstream classification tasks, providing an effective solution to the scarcity of medical image training data.</p>","PeriodicalId":9020,"journal":{"name":"BMC Medical Imaging","volume":"25 1","pages":"372"},"PeriodicalIF":3.2000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465763/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Imaging","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12880-025-01909-5","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0

Abstract

Background: Deep learning necessitates a substantial amount of data, yet obtaining sufficient medical images is difficult due to concerns about patient privacy and high collection costs.

Methods: To address this issue, we propose a conditional latent diffusion model-based medical image enhancement network, referred to as the Medical Image Enhancement Network (MedIENet). To meet the rigorous standards required for image generation in the medical imaging field, a multi-attention module is incorporated in the encoder of the denoising U-Net backbone. Additionally Rotary Position Embedding (RoPE) is integrated into the self-attention module to effectively capture positional information, while cross-attention is utilised to embed integrate class information into the diffusion process.

Results: MedIENet is evaluated on three datasets: Chest CT-Scan images, Chest X-Ray Images (Pneumonia), and Tongue dataset. Compared to existing methods, MedIENet demonstrates superior performance in both fidelity and diversity of the generated images. Experimental results indicate that for downstream classification tasks using ResNet50, the Area Under the Receiver Operating Characteristic curve (AUROC) achieved with real data alone is 0.76 for the Chest CT-Scan images dataset, 0.87 for the Chest X-Ray Images (Pneumonia) dataset, and 0.78 for the Tongue Dataset. When using mixed data consisting of real data and generated data, the AUROC improves to 0.82, 0.94, and 0.82, respectively, reflecting increases of approximately 6%, 7%, and 4%.

Conclusion: These findings indicate that the images generated by MedIENet can enhance the performance of downstream classification tasks, providing an effective solution to the scarcity of medical image training data.

MedIENet:基于条件潜扩散模型的医学图像增强网络。
背景:深度学习需要大量的数据,但由于对患者隐私和高收集成本的担忧,很难获得足够的医学图像。方法:为了解决这一问题,我们提出了一种基于条件潜在扩散模型的医学图像增强网络,称为医学图像增强网络(MedIENet)。为了满足医学成像领域对图像生成的严格要求,在去噪U-Net骨干网的编码器中加入了多关注模块。此外,在自注意模块中集成了旋转位置嵌入(RoPE)来有效地捕获位置信息,而交叉注意则用于将集成的类信息嵌入到扩散过程中。结果:MedIENet在三个数据集上进行评估:胸部ct扫描图像,胸部x射线图像(肺炎)和舌头数据集。与现有方法相比,MedIENet在生成图像的保真度和多样性方面都表现出优异的性能。实验结果表明,对于使用ResNet50的下游分类任务,仅使用真实数据获得的接受者工作特征曲线下面积(AUROC)对于胸部ct扫描图像数据集为0.76,对于胸部x射线图像(肺炎)数据集为0.87,对于舌头数据集为0.78。当使用由真实数据和生成数据组成的混合数据时,AUROC分别提高到0.82、0.94和0.82,分别提高了约6%、7%和4%。结论:这些发现表明MedIENet生成的图像可以提高下游分类任务的性能,有效解决了医学图像训练数据稀缺的问题。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
BMC Medical Imaging
BMC Medical Imaging RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING-
CiteScore
4.60
自引率
3.70%
发文量
198
审稿时长
27 weeks
期刊介绍: BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信