All-in-one medical image-to-image translation

Luyi Han, Tao Tan, Yunzhi Huang, Haoran Dou, Tianyu Zhang, Yuan Gao, Xin Wang, Chunyao Lu, Xinglong Liang, Yue Sun, Jonas Teuwen, S Kevin Zhou, Ritse Mann

Cell Reports Methods, article 101138. Epub 2025-08-11; published 2025-08-18. DOI: 10.1016/j.crmeth.2025.101138. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461644/pdf/
Abstract
The growing availability of public multi-domain medical image datasets enables training omnipotent image-to-image (I2I) translation models. However, integrating diverse protocols poses challenges in domain encoding and scalability. Therefore, we propose the "every domain all at once" I2I (EVA-I2I) translation model using DICOM-tag-informed contrastive language-image pre-training (DCLIP). DCLIP maps natural language scan descriptions into a common latent space, offering richer representations than traditional one-hot encoding. We develop the model using seven public datasets with 27,950 scans (3D volumes) of the brain, breast, abdomen, and pelvis. Experimental results show that our EVA-I2I can synthesize every seen domain at once with a single training session and achieve excellent image quality on different I2I translation tasks. Results on downstream applications (e.g., registration, classification, and segmentation) demonstrate that EVA-I2I can be applied directly to domain adaptation on external datasets without fine-tuning, and show potential for zero-shot domain adaptation to never-before-seen domains.
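To make the domain-encoding idea concrete, below is a minimal sketch of DCLIP-style pre-training as described in the abstract: a text encoder embeds a scan description assembled from DICOM tags, an image encoder embeds the corresponding volume, and a symmetric contrastive (InfoNCE) loss pulls matched pairs together in a shared latent space. This is not the authors' implementation; the architectures, tokenization, dimensions, and hyperparameters are all illustrative assumptions, shown here in PyTorch.

```python
# Minimal sketch of DICOM-tag-informed contrastive language-image
# pre-training (DCLIP) as described in the abstract. NOT the authors'
# implementation: tokenizer, encoders, and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    """Embeds a scan description built from DICOM tags
    (e.g. "MR, T1-weighted, brain, post-contrast") into a latent vector."""
    def __init__(self, vocab_size=10000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):              # (B, T) token ids
        h, _ = self.rnn(self.embed(token_ids))
        return self.proj(h[:, -1])             # (B, dim) latent code

class ImageEncoder(nn.Module):
    """Tiny 3D CNN standing in for the volume encoder."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, dim),
        )

    def forward(self, vol):                    # (B, 1, D, H, W) volume
        return self.net(vol)                   # (B, dim) latent code

def clip_loss(img_z, txt_z, temperature=0.07):
    """Symmetric InfoNCE: matched (volume, description) pairs attract,
    mismatched pairs within the batch repel."""
    img_z = F.normalize(img_z, dim=-1)
    txt_z = F.normalize(txt_z, dim=-1)
    logits = img_z @ txt_z.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(len(logits))        # diagonal = matched pairs
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

# Toy forward/backward pass with random data.
B = 4
img_enc, txt_enc = ImageEncoder(), TextEncoder()
vols = torch.randn(B, 1, 16, 32, 32)           # fake 3D scans
tokens = torch.randint(0, 10000, (B, 12))      # fake tokenized tags
loss = clip_loss(img_enc(vols), txt_enc(tokens))
loss.backward()
print(f"contrastive loss: {loss.item():.3f}")
```

The text embedding learned this way can then condition an I2I generator in place of a one-hot domain code, which is what allows a single model to cover every seen protocol and, in principle, to accept descriptions of never-before-seen domains at inference time.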