MMIT-DDPM - Multilateral medical image translation with class and structure supervised diffusion-based model.

IF 7 · Zone 2 (Medicine) · JCR Q1 (BIOLOGY)

Computers in biology and medicine Pub Date : 2025-02-01 Epub Date: 2024-12-03 DOI:10.1016/j.compbiomed.2024.109501

Sanjeet S Patil, Rishav Rajak, Manojkumar Ramteke, Anurag S Rathore

Volume 185, article 109501. Publication type: Journal Article.

Citations: 0

Abstract

Unified translation of medical images from one modality to many distinct modalities is desirable in healthcare settings. A ubiquitous approach for bilateral medical scan translation is one-to-one mapping with GANs. However, the efficacy of this approach in capturing the diversity in a pool of medical scans and in performing one-to-many translation is questionable. In contrast, the Denoising Diffusion Probabilistic Model (DDPM) exhibits exceptional ability in image generation owing to its scalability and its ability to capture the distribution of the whole training data. Therefore, we propose a novel conditioning mechanism for the deterministic translation of medical scans from a source modality to any target modality with a DDPM. The model denoises the target modality under the guidance of a source-modality structure encoder and a source-to-target class conditioner. This mechanism thus serves as prior information for sampling the desired target modality during inference. Training and testing were carried out on the T1-weighted, T2-weighted, and Fluid Attenuated Inversion Recovery (FLAIR) sequences of the BraTS 2021 dataset. The proposed model is capable of unified multilateral translation among the six combinations of T1ce, T2, and FLAIR sequences of brain MRI, eliminating the need for multiple bilateral translation models. We have compared the performance of our architecture against state-of-the-art convolution-based and transformer-based GANs. The diffusion model efficiently covers the distribution of multiple modalities while producing better image quality in the translated sequences, as evidenced by average improvements of 8.06% in Multi-Scale Structural Similarity (MSSIM) and 2.52 in Fréchet Inception Distance (FID) compared with the CNN- and transformer-based GAN architectures.
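The conditioning idea described in the abstract can be illustrated with a minimal sketch. It enumerates the six directed source-to-target pairs among T1ce, T2, and FLAIR, assigns each a class index, and assembles the inputs a conditional denoiser would receive for one training step. Everything beyond the abstract is an assumption made for illustration: the linear beta schedule (taken from the original DDPM formulation), the integer label encoding, the flat-list image representation, and the helper names `PAIR_TO_CLASS`, `q_sample`, and `make_training_example`. The actual network, schedule, and conditioning layers of MMIT-DDPM are not specified in the abstract.

```python
import itertools
import math
import random

# Modalities named in the abstract.
MODALITIES = ("T1ce", "T2", "FLAIR")

# Every ordered (source, target) pair with source != target: 3 * 2 = 6 directions.
PAIRS = list(itertools.permutations(MODALITIES, 2))

# Hypothetical encoding for the class conditioner: one index per direction.
PAIR_TO_CLASS = {pair: idx for idx, pair in enumerate(PAIRS)}


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; the abstract names no schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha-bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, alpha_bars, rng):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(x0, eps)]
    return xt, eps


def make_training_example(source_img, target_img, source, target, alpha_bars, rng):
    """Collect the inputs a conditional denoiser would see for one step."""
    t = rng.randrange(len(alpha_bars))
    noisy_target, eps = q_sample(target_img, t, alpha_bars, rng)
    return {
        "noisy_target": noisy_target,      # image to be denoised
        "timestep": t,
        "structure_input": source_img,     # would feed the structure encoder
        "class_label": PAIR_TO_CLASS[(source, target)],  # would feed the class conditioner
        "noise_target": eps,               # regression target for noise prediction
    }


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    flair = [0.1, 0.5, 0.9, 0.3]  # toy "images" as flat pixel lists
    t2 = [0.2, 0.4, 0.8, 0.6]
    example = make_training_example(flair, t2, "FLAIR", "T2", alpha_bars, rng)
    print(len(PAIRS), example["class_label"], example["timestep"])
```

Because a single denoiser receives the direction as a label, one model can in principle serve all six directions, which is the property the abstract contrasts with training a separate bilateral GAN per pair.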


Source journal

Computers in biology and medicine (Engineering & Technology: Engineering, Biomedical)

CiteScore: 11.70

Self-citation rate: 10.40%

Articles published: 1086

Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.