Attn-DeCGAN: A Diversity-Enhanced CycleGAN With Attention for High-Fidelity Medical Image Translation

Matthew Cobbinah, Henry Nunoo-Mensah, Prince Ebenezer Adjei, Francisca Adoma Acheampong, Isaac Acquah, Eric Tutu Tchao, Andrew Selasi Agbemenu, Emmanuel Abaidoo, Ike Asamoah-Ansah, Obed Kojo Otoo, Amina Salifu, Albert Dede, Julius Adinkrah, Jerry John Kponyo

Engineering Reports, vol. 7, issue 8, published 2025-08-04. DOI: 10.1002/eng2.70320. https://onlinelibrary.wiley.com/doi/10.1002/eng2.70320
Abstract
Unpaired image-to-image translation has emerged as a transformative paradigm in medical imaging, enabling cross-modality translation without the need for aligned datasets. While cycle-consistent generative adversarial networks (CycleGANs) have shown considerable promise in this domain, they remain inherently constrained by the locality of convolutional operations, resulting in global structural inconsistencies, and by mode collapse, which restricts generative diversity. To overcome these limitations, we propose Attn-DeCGAN, a novel attention-augmented, diversity-aware CycleGAN framework designed to enhance both structural fidelity and perceptual diversity in CT-MRI translation tasks. Attn-DeCGAN replaces conventional ResNet-based generators with Hybrid Perception Blocks (HPBs), which synergize depthwise convolutions for spatially efficient local feature extraction with a Dual-Pruned Self-Attention (DPSA) mechanism that enables sparse, content-adaptive modeling of long-range dependencies at linear complexity. This architectural innovation facilitates the modeling of anatomically distant relationships while maintaining inference efficiency. The model is trained using a composite loss function incorporating adversarial, cycle-consistency, identity, and VGG19-based structural consistency losses to preserve both realism and anatomical detail. Extensive empirical evaluations demonstrate that Attn-DeCGAN achieves superior performance across key metrics, including the lowest FID scores (60, 58), the highest PSNR (27, 33), and statistically significant improvements in perceptual diversity (LPIPS, p < 0.05) compared to state-of-the-art baselines. Ablation studies underscore the critical role of spectral normalization in stabilizing adversarial training and enhancing attention effectiveness. Expert radiologist assessments confirmed the clinical superiority of Attn-DeCGAN over the next best baseline, DeCGAN, with 100% real classifications and higher confidence scores in CT synthesis, and more anatomically convincing outputs in MRI translation. The framework has particular utility in low-resource clinical environments where MRI is scarce, supporting synthetic MRI generation for diagnosis, radiotherapy planning, and medical image dataset augmentation. Despite increased training complexity, Attn-DeCGAN retains efficient inference, positioning it as a technically robust and clinically deployable solution for high-fidelity unpaired medical image translation.
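The abstract describes Hybrid Perception Blocks as depthwise convolutions fused with Dual-Pruned Self-Attention, but gives no implementation details. As a rough illustration of the idea, the PyTorch sketch below pairs a depthwise-separable convolution branch with a top-k key-pruned attention branch; the module names, layer sizes, and pruning rule are all illustrative assumptions, and this naive variant still computes dense attention scores before pruning (quadratic in pixel count), whereas the paper's DPSA reportedly achieves linear complexity.

```python
# Minimal sketch of a Hybrid Perception Block, assuming a simple top-k key
# pruning as a stand-in for the paper's (unspecified) DPSA mechanism.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKPrunedAttention(nn.Module):
    """Self-attention keeping only the top-k keys per query (DPSA stand-in)."""

    def __init__(self, channels: int, k: int = 64):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.k, self.scale = k, channels ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).flatten(2).chunk(3, dim=1)   # each (b, c, h*w)
        scores = (q.transpose(1, 2) @ k) * self.scale      # (b, h*w, h*w)
        topk = min(self.k, scores.shape[-1])
        vals, idx = scores.topk(topk, dim=-1)              # prune weak keys
        sparse = torch.full_like(scores, float("-inf")).scatter_(-1, idx, vals)
        out = F.softmax(sparse, dim=-1) @ v.transpose(1, 2)  # (b, h*w, c)
        return self.proj(out.transpose(1, 2).reshape(b, c, h, w))


class HybridPerceptionBlock(nn.Module):
    """Local depthwise-conv branch plus a pruned global-attention branch."""

    def __init__(self, channels: int):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),  # depthwise
            nn.Conv2d(channels, channels, 1),                              # pointwise
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.attn = TopKPrunedAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.local(x)     # residual local features
        return x + self.attn(x)   # residual long-range context
```

A generator could stack several such blocks in place of the usual ResNet blocks, e.g. `HybridPerceptionBlock(256)(torch.randn(1, 256, 64, 64))`.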
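The composite objective names four terms: adversarial, cycle-consistency, identity, and a VGG19-based structural consistency loss. A hedged sketch of one generator direction follows; the LSGAN adversarial form, the loss weights, the VGG19 layer cutoff, and the choice to compare the source image's features against its translation are all assumptions not fixed by the abstract.

```python
# Hedged sketch of the composite generator objective (one translation direction).
import torch
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights


class VGG19Features(nn.Module):
    """Frozen VGG19 trunk used as a structural-consistency feature extractor."""

    def __init__(self, n_layers: int = 16):  # layer cutoff is an assumption
        super().__init__()
        self.trunk = vgg19(weights=VGG19_Weights.DEFAULT).features[:n_layers].eval()
        for p in self.trunk.parameters():
            p.requires_grad_(False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.shape[1] == 1:            # VGG19 expects 3 channels; repeat
            x = x.repeat(1, 3, 1, 1)   # single-channel medical images
        return self.trunk(x)


def generator_objective(d_fake, real, cycled, identity, fake, vgg,
                        lam_cyc=10.0, lam_idt=5.0, lam_str=1.0):
    # Weights follow common CycleGAN practice, not the paper's values.
    adv = torch.mean((d_fake - 1.0) ** 2)               # LSGAN adversarial term
    cyc = torch.mean(torch.abs(cycled - real))          # cycle consistency (L1)
    idt = torch.mean(torch.abs(identity - real))        # identity preservation
    struct = torch.mean(torch.abs(vgg(fake) - vgg(real)))  # VGG19 structural term
    return adv + lam_cyc * cyc + lam_idt * idt + lam_str * struct
```

Comparing VGG features of the source and its translation is one common way to preserve anatomy in unpaired settings; the abstract does not specify which image pair its structural term uses.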
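The ablation results credit spectral normalization with stabilizing adversarial training. The sketch below shows how it is typically applied, via `torch.nn.utils.spectral_norm`, to a PatchGAN-style discriminator; the PatchGAN layout itself is an assumption carried over from the standard CycleGAN, not stated in the abstract.

```python
# Minimal sketch: spectral normalization on a PatchGAN-style discriminator.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


def sn_conv(in_ch: int, out_ch: int, stride: int) -> nn.Module:
    # spectral_norm constrains the layer's largest singular value to ~1,
    # bounding the discriminator's Lipschitz constant.
    return spectral_norm(nn.Conv2d(in_ch, out_ch, 4, stride=stride, padding=1))


class SNPatchDiscriminator(nn.Module):
    """PatchGAN discriminator with spectral-normalized convolutions."""

    def __init__(self, in_ch: int = 1, base: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            sn_conv(in_ch, base, 2), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base, base * 2, 2), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base * 2, base * 4, 2), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base * 4, base * 8, 1), nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv2d(base * 8, 1, 4, stride=1, padding=1)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # per-patch realism map; no sigmoid under LSGAN


# Usage: SNPatchDiscriminator()(torch.randn(1, 1, 256, 256))
```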