GEKD: A graph-enhanced knowledge discriminator for universal improvements in medical image synthesis
Chujie Zhang, Jihong Hu, Yinhao Li, Lanfen Lin, Yen-Wei Chen
Displays, Volume 91, Article 103197 (published 2025-09-02)
DOI: 10.1016/j.displa.2025.103197
Citations: 0
Abstract
Multimodal medical imaging is crucial for comprehensive diagnosis, yet acquiring complete multimodal datasets remains challenging due to economic constraints and technical limitations. Currently, Generative Adversarial Networks (GANs) and Diffusion Models (DMs) represent the two predominant paradigms for medical image synthesis, but they share a critical limitation: convolutional neural networks (CNNs) tend to optimize pixel intensities while neglecting anatomical structural integrity. Although attention mechanisms have been introduced to improve these models, existing methods fail to adequately account for relationships between anatomical regions within images and structural correspondences across different modalities, resulting in inaccurate or incomplete representation of critical regions. This paper presents the Graph-Enhanced Knowledge Discriminator (GEKD), a plug-and-play contextual prior learning module that explicitly models both intra-image and inter-image structural relationships to guide generators toward maintaining anatomical consistency. Inspired by radiology residency training programs, GEKD simulates the cognitive process of medical experts analyzing multimodal images by constructing structural graphs that capture important associations between anatomical regions. Our integration of GEKD with multiple state-of-the-art medical image synthesis methods across four datasets demonstrates that this approach significantly enhances the structural accuracy and clinical relevance of synthesized images, shifting the paradigm from ‘where to look’ to ‘understanding structural relationships.’ By modeling both local (intra-image) and global (inter-image) structural dependencies, GEKD directly addresses the fundamental limitation of existing models that prioritize pixel-level fidelity over structural integrity, providing a broadly applicable solution for diverse medical imaging scenarios.
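The core idea — comparing structural graphs built over anatomical regions of real and synthesized images — can be illustrated with a minimal sketch. This is not the paper's actual GEKD module: the function names, the intensity-based node features, and the fully connected edge weights are all illustrative assumptions standing in for the learned graph representation described in the abstract.

```python
import numpy as np

def region_graph(image, labels, n_regions):
    """Build a toy structural graph for one image.

    Nodes: mean intensity of each labeled anatomical region.
    Edges: absolute difference between region means (a hypothetical
    stand-in for the learned inter-region associations in GEKD).
    """
    nodes = np.array([
        image[labels == r].mean() if (labels == r).any() else 0.0
        for r in range(n_regions)
    ])
    # Fully connected edge-weight matrix, for illustration only.
    edges = np.abs(nodes[:, None] - nodes[None, :])
    return nodes, edges

def structural_consistency(real, synth, labels, n_regions):
    """Score in (0, 1]: 1.0 means the two region graphs are identical.

    A discriminator-style module could penalize the generator when
    this score is low, pushing it toward structural (rather than
    purely pixel-level) fidelity.
    """
    _, e_real = region_graph(real, labels, n_regions)
    _, e_synth = region_graph(synth, labels, n_regions)
    diff = np.abs(e_real - e_synth).mean()
    return float(1.0 / (1.0 + diff))

# Toy example: a 3x3 "image" with three anatomical regions.
labels = np.array([[0, 0, 1],
                   [0, 2, 1],
                   [2, 2, 1]])
real = np.array([[0.1, 0.1, 0.9],
                 [0.1, 0.5, 0.9],
                 [0.5, 0.5, 0.9]])
perfect = structural_consistency(real, real, labels, 3)      # identical graphs
degraded = structural_consistency(real, real * 0.5, labels, 3)  # contrast shift
```

An identical synthesis yields a score of exactly 1.0, while a globally rescaled one scores lower because the inter-region intensity relationships change; this captures, in miniature, why a graph over regions is sensitive to structural inconsistencies that a pixel-wise loss can miss.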
About the journal:
Displays is the international journal covering the research and development of display technology, the effective presentation and perception of information, and applications and systems including the display-human interface.
Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technology and human factors engineers new to the field, will also occasionally be featured.