MMIT-DDPM - Multilateral medical image translation with class and structure supervised diffusion-based model
Sanjeet S Patil, Rishav Rajak, Manojkumar Ramteke, Anurag S Rathore
Computers in Biology and Medicine, vol. 185, article 109501. Publication date: 2025-02-01 (Epub 2024-12-03). Impact factor: 7.0 (JCR Q1, Biology).
DOI: 10.1016/j.compbiomed.2024.109501 (https://doi.org/10.1016/j.compbiomed.2024.109501)
Citations: 0
Abstract
Unified translation of medical images from one modality to many distinct modalities is desirable in healthcare settings. A ubiquitous approach to bilateral medical scan translation is one-to-one mapping with GANs. However, the efficacy of this approach in capturing the diversity of a pool of medical scans and performing one-to-many translation is questionable. In contrast, the Denoising Diffusion Probabilistic Model (DDPM) exhibits exceptional ability in image generation owing to its scalability and its ability to capture the distribution of the whole training data. Therefore, we propose a novel conditioning mechanism for the deterministic translation of medical scans from a source modality to any target modality with a DDPM. The model denoises the target modality under the guidance of a source-modality structure encoder and a source-to-target class conditioner. Consequently, this mechanism serves as prior information for sampling the desired target modality during inference. Training and testing were carried out on the T1-weighted, T2-weighted, and Fluid Attenuated Inversion Recovery (FLAIR) sequences of the BraTS 2021 dataset. The proposed model is capable of unified multilateral translation among six combinations of the T1ce, T2, and FLAIR sequences of brain MRI, eliminating the need for multiple bilateral translation models. We analyzed the performance of our architecture against state-of-the-art, convolution-based, and Transformer-based GANs. The diffusion model efficiently covers the distribution of multiple modalities while producing better image quality in the translated sequences, as evidenced by an average improvement of 8.06% in Multi-Scale Structural Similarity (MSSIM) and 2.52 in Fréchet Inception Distance (FID) compared with the CNN- and Transformer-based GAN architectures.
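The "six combinations" mentioned in the abstract can be read as the ordered source-to-target pairs among three sequences (3 × 2 = 6). The short Python sketch below enumerates those pairs and assigns each an integer label of the kind a source-to-target class conditioner might consume. This is only an illustration under that reading: the names `MODALITIES`, `TRANSLATION_PAIRS`, `PAIR_TO_CLASS`, and `class_id`, and the integer encoding itself, are assumptions and are not taken from the paper.

```python
from itertools import permutations

# The three MRI sequences named in the abstract.
MODALITIES = ("T1ce", "T2", "FLAIR")

# Ordered (source, target) pairs with source != target: 3 * 2 = 6 directions.
TRANSLATION_PAIRS = list(permutations(MODALITIES, 2))

# Hypothetical encoding: one integer class label per translation direction.
PAIR_TO_CLASS = {pair: idx for idx, pair in enumerate(TRANSLATION_PAIRS)}


def class_id(source: str, target: str) -> int:
    """Return the (assumed) class label for translating source -> target."""
    if source == target:
        raise ValueError("source and target modality must differ")
    try:
        return PAIR_TO_CLASS[(source, target)]
    except KeyError:
        raise ValueError(f"unknown modality pair: {source!r} -> {target!r}")


if __name__ == "__main__":
    # List every translation direction with its label.
    for (src, tgt), idx in PAIR_TO_CLASS.items():
        print(f"{idx}: {src} -> {tgt}")
```

With a single label per direction, one model can in principle be asked for any of the six translations, whereas one-to-one GAN mapping would need a separate model per direction.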
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.