MLDCGAN: A multimodal latent diffusion conditioned GAN model for accelerated and high-fidelity MRI-CT synthesis in radiotherapy planning

IF 7.6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems | Pub Date: 2025-09-14 | DOI: 10.1016/j.knosys.2025.114491

Can Hu, Chunchao Xia, Chuanbing Wang, Xiayu Hang, Xiuhan Li, Han Zhou, Ning Cao

Knowledge-Based Systems, Volume 329, Article 114491. https://www.sciencedirect.com/science/article/pii/S0950705125015308

Citations: 0

Abstract

Magnetic resonance imaging (MRI) offers significant advantages in soft tissue contrast. However, it cannot directly provide electron density information for radiotherapy, so planning relies instead on time-consuming and error-prone MRI-CT image registration. Synthetic CT (sCT) technology, which generates CT images directly from MRI, is pivotal for achieving MRI-only radiotherapy. However, existing synthesis methods based on generative adversarial networks (GANs) and diffusion models face challenges such as prolonged inference times and insufficient utilization of multimodal information, which severely hinder the clinical application of synthetic images. In this study, we propose a novel Multimodal Latent Diffusion Conditioned GAN (MLDCGAN) model. First, we design a non-parametric, non-Gaussian complex denoising distribution based on a conditional GAN, employing a multimodal distribution to achieve large-step denoising. This is combined with a pre-trained autoencoder that compresses the image into a low-dimensional latent space, significantly reducing inference time. Second, we fully leverage multimodal MRI information by constructing a local refinement conditional generator with multimodal inputs, including T1-weighted (T1W), T2-weighted (T2W), and mask images. The generator is enhanced by an adaptive weighted multi-sequence fusion module and an enhanced cross-attention module, significantly improving the structural consistency and detail fidelity of the generated sCT images. Finally, by jointly optimizing the style loss and content loss, we ensure the perceptual quality and clinical accuracy of the synthetic images. Experimental results demonstrate that MLDCGAN outperforms existing state-of-the-art methods on both public and private datasets, showing significant improvements in both image quality and inference speed. Subjective evaluations from multiple experienced clinicians indicate that the generated sCT images exhibit no significant difference from real CT in terms of key anatomical structure clarity and overall quality (P > 0.05). Further assessments of clinical target delineation and dose distribution confirm that sCT retains anatomical features well and provides dose distributions consistent with real CT, ensuring the reliability of dose calculations in radiotherapy planning. This study provides a more reliable and efficient technical foundation for achieving MRI-only radiotherapy. It is expected to assist clinicians in developing more precise radiotherapy plans, ultimately improving treatment outcomes in future clinical practice.
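
The abstract mentions jointly optimizing a style loss and a content loss but does not define either. As a rough illustration only, the sketch below uses the common formulation from neural style transfer: content loss as the mean squared error between feature maps, and style loss as the mean squared error between their Gram matrices. This is an assumption, not the authors' definition. The function names, the `[channels][positions]` list layout, and the weights `w_content` and `w_style` are all hypothetical.

```python
# Illustrative sketch, not the paper's implementation.
# Feature maps are plain nested lists shaped [channels][positions].

def content_loss(feat_fake, feat_real):
    """Mean squared error between two equally shaped feature maps."""
    total, count = 0.0, 0
    for ch_fake, ch_real in zip(feat_fake, feat_real):
        for a, b in zip(ch_fake, ch_real):
            total += (a - b) ** 2
            count += 1
    return total / count


def gram_matrix(feat):
    """Channel-by-channel inner products, normalised by the number of positions."""
    n = len(feat[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in feat] for ci in feat]


def style_loss(feat_fake, feat_real):
    """Mean squared error between the Gram matrices of two feature maps."""
    return content_loss(gram_matrix(feat_fake), gram_matrix(feat_real))


def joint_loss(feat_fake, feat_real, w_content=1.0, w_style=1.0):
    """Weighted sum of both terms; the weights are hypothetical placeholders."""
    return (w_content * content_loss(feat_fake, feat_real)
            + w_style * style_loss(feat_fake, feat_real))


# Toy features standing in for encoder activations of a real CT and an sCT.
real = [[1.0, 2.0], [3.0, 4.0]]
fake = [[0.0, 0.0], [0.0, 0.0]]
print(joint_loss(real, real))  # identical features give zero loss
print(joint_loss(fake, real))
```

In practice the feature maps would typically come from a fixed pre-trained network and the two weights would be tuned; neither choice is specified in the abstract.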




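Source journal

Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 14.80

Self-citation rate: 12.50%

Publications: 1245

Review time: 7.8 months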

Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.