GEKD: A graph-enhanced knowledge discriminator for universal improvements in medical image synthesis
Chujie Zhang, Jihong Hu, Yinhao Li, Lanfen Lin, Yen-Wei Chen
Displays, Volume 91, Article 103197 (published 2025-09-02)
DOI: 10.1016/j.displa.2025.103197
Citations: 0
Abstract
Multimodal medical imaging is crucial for comprehensive diagnosis, yet acquiring complete multimodal datasets remains challenging due to economic constraints and technical limitations. Currently, Generative Adversarial Networks (GANs) and Diffusion Models (DMs) represent the two predominant paradigms for medical image synthesis, but they share a critical limitation: convolutional neural networks (CNNs) tend to optimize pixel intensities while neglecting anatomical structural integrity. Although attention mechanisms have been introduced to improve these models, existing methods fail to adequately account for relationships between anatomical regions within images and structural correspondences across different modalities, resulting in inaccurate or incomplete representation of critical regions. This paper presents the Graph-Enhanced Knowledge Discriminator (GEKD), a plug-and-play contextual prior learning module that explicitly models both intra-image and inter-image structural relationships to guide generators toward maintaining anatomical consistency. Inspired by radiology residency training programs, GEKD simulates the cognitive process of medical experts analyzing multimodal images by constructing structural graphs that capture important associations between anatomical regions. Our integration of GEKD with multiple state-of-the-art medical image synthesis methods across four datasets demonstrates that this approach significantly enhances the structural accuracy and clinical relevance of synthesized images, shifting the paradigm from ‘where to look’ to ‘understanding structural relationships.’ By modeling both local (intra-image) and global (inter-image) structural dependencies, GEKD directly addresses the fundamental limitation of existing models that prioritize pixel-level fidelity over structural integrity, providing a broadly applicable solution for diverse medical imaging scenarios.
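The core idea — comparing structural graphs built over anatomical regions of real and synthesized images — can be illustrated with a minimal sketch. This is not the paper's actual GEKD module: the function names, the intensity-based node features, and the fully connected edge weights are all illustrative assumptions standing in for the learned graph representation described in the abstract.

```python
import numpy as np

def region_graph(image, labels, n_regions):
    """Build a toy structural graph for one image.

    Nodes: mean intensity of each labeled anatomical region.
    Edges: absolute difference between region means (a hypothetical
    stand-in for the learned inter-region associations in GEKD).
    """
    nodes = np.array([
        image[labels == r].mean() if (labels == r).any() else 0.0
        for r in range(n_regions)
    ])
    # Fully connected edge-weight matrix, for illustration only.
    edges = np.abs(nodes[:, None] - nodes[None, :])
    return nodes, edges

def structural_consistency(real, synth, labels, n_regions):
    """Score in (0, 1]: 1.0 means the two region graphs are identical.

    A discriminator-style module could penalize the generator when
    this score is low, pushing it toward structural (rather than
    purely pixel-level) fidelity.
    """
    _, e_real = region_graph(real, labels, n_regions)
    _, e_synth = region_graph(synth, labels, n_regions)
    diff = np.abs(e_real - e_synth).mean()
    return float(1.0 / (1.0 + diff))

# Toy example: a 3x3 "image" with three anatomical regions.
labels = np.array([[0, 0, 1],
                   [0, 2, 1],
                   [2, 2, 1]])
real = np.array([[0.1, 0.1, 0.9],
                 [0.1, 0.5, 0.9],
                 [0.5, 0.5, 0.9]])
perfect = structural_consistency(real, real, labels, 3)      # identical graphs
degraded = structural_consistency(real, real * 0.5, labels, 3)  # contrast shift
```

An identical synthesis yields a score of exactly 1.0, while a globally rescaled one scores lower because the inter-region intensity relationships change; this captures, in miniature, why a graph over regions is sensitive to structural inconsistencies that a pixel-wise loss can miss.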
About the journal:
Displays is the international journal covering the research and development of display technology, the effective presentation and perception of information, and applications and systems including the display-human interface.
Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technology and human factors engineers new to the field, will also occasionally be featured.