Yannuo Wen, Kathleen M Curran, Xinzhu Wang, Nuala A Healy, John J Healy
Title: Synthesizing breast cancer ultrasound images from healthy samples using latent diffusion models
DOI: 10.1117/1.JMI.13.2.024002
Journal: Journal of Medical Imaging, vol. 13, no. 2, p. 024002 (IF 1.7, JCR Q3, Radiology, Nuclear Medicine & Medical Imaging)
Published: 2026-03-01 (Epub 2026-03-19)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999972/pdf/
Citations: 0
Abstract
Purpose: Breast ultrasound is widely used for cancer screening, but data scarcity and annotation challenges hinder deep learning adoption. Synthetic image generation offers a promising solution to enhance training datasets while preserving patient privacy. However, problems such as inadequate quality of synthesized images and the need for large amounts of data to train the synthesis models remain significant.
Approach: We propose a three-stage latent diffusion model (LDM) workflow, enhanced by Vision Transformers and fine-tuned with low-rank adaptation (LoRA), that synthesizes realistic malignant and benign breast ultrasound images directly from healthy samples while simultaneously generating accurate segmentation masks. Dividing the task into stages significantly reduces the complexity that any single synthesis model must handle. Applied to the BUSI dataset (133 healthy, 487 benign, and 210 malignant images), the method generates synthetic cases of each tumor type.
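The abstract names low-rank adaptation (LoRA) as the fine-tuning technique. A minimal sketch of the idea, using NumPy and entirely hypothetical dimensions (the paper does not specify its LoRA configuration): a frozen weight matrix W is augmented with a trainable low-rank residual B @ A, so only r·(d_in + d_out) parameters are updated instead of d_in·d_out.

```python
import numpy as np

# Hypothetical LoRA sketch: W is the frozen pretrained weight; only the
# small factors A and B are trained. Dimensions are illustrative, not
# taken from the paper.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8.0

W = rng.standard_normal((d_out, d_in))       # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection, zero init

def lora_forward(x):
    """Forward pass with the scaled low-rank residual update added."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialised to zero, the adapted layer reproduces the frozen one,
# so fine-tuning starts exactly from the pretrained behaviour.
assert np.allclose(lora_forward(x), W @ x)

lora_params = A.size + B.size   # 512 here
full_params = W.size            # 4096 here
print(f"trainable params: {lora_params} vs full fine-tune: {full_params}")
```

The zero initialisation of B is the standard LoRA choice: it guarantees the adapted model is identical to the base model before training, which is why LoRA can be applied safely to a pretrained diffusion backbone.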
Results: A ResNet101 classifier could not reliably distinguish synthetic from real images (AUC = 0.563), indicating high visual plausibility. Quantitative metrics confirmed strong fidelity: Fréchet inception distance = 15.2 and inception score = 1.79, indicating low distributional divergence in feature space and high similarity to real data. When used for training a U-Net segmentation model, the augmented dataset improved the F1-score from 0.870 to 0.896, demonstrating substantial gains in diagnostic accuracy.
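For reference, the F1-score reported above is the harmonic mean of precision and recall over the predicted mask pixels. A self-contained sketch on toy binary masks (not the BUSI data or the paper's evaluation code):

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for binary labels."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy flattened segmentation masks: 1 = tumor pixel, 0 = background.
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 1])
y_pred = np.array([1, 1, 0, 0, 1, 1, 0, 1])
print(round(f1_score(y_true, y_pred), 3))  # → 0.8
```

On these toy masks precision and recall are both 4/5, giving F1 = 0.8; in the paper's setting the same statistic is computed over the U-Net's predicted tumor masks.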
Conclusions: These results show that the proposed three-stage LDM can generate high-quality, anatomically coherent breast cancer images from healthy controls, effectively alleviating data scarcity and enabling more robust training of medical AI systems without compromising clinical realism.
Journal introduction:
JMI covers fundamental and translational research, as well as applications, focused on medical imaging, which continue to yield physical and biomedical advancements in the early detection, diagnostics, and therapy of disease, as well as in the understanding of normal anatomy and physiology. The scope of JMI includes: imaging physics; tomographic reconstruction algorithms (such as those in CT and MRI); image processing and deep learning; computer-aided diagnosis and quantitative image analysis; visualization and modeling; picture archiving and communication systems (PACS); image perception and observer performance; technology assessment; ultrasonic imaging; image-guided procedures; digital pathology; and biomedical applications of imaging. JMI allows for the peer-reviewed communication and archiving of scientific developments, translational and clinical applications, reviews, and recommendations for the field.