Diffusion-based image translation from white light to narrow-band imaging in gastrointestinal endoscopy
Bilin Wang, Changda Lei, Kaicheng Hong, Xiuji Kan, Yifan Ouyang, Junbo Li, Yunbo Guo, Rui Li
Computerized Medical Imaging and Graphics, vol. 124, Article 102605. Journal article, published 2025-07-14. Impact factor 4.9; JCR Q1 (Engineering, Biomedical).
DOI: 10.1016/j.compmedimag.2025.102605
https://www.sciencedirect.com/science/article/pii/S0895611125001144
Abstract
Narrow-band imaging (NBI) enhances vascular and mucosal visualization, enabling early detection of gastrointestinal lesions. However, its adoption is limited by hardware constraints and cost, leaving white light endoscopy (WLE) as the widely used but diagnostically inferior modality. Translating WLE images into realistic NBI-like images offers a scalable way to improve diagnostic workflows, generate synthetic datasets, and facilitate multi-modality analysis. The task is challenging for two reasons: paired WLE-NBI image datasets for training are lacking, and lesions in gastrointestinal endoscopy are complex and varied, often involving rich details and subtle textures. In this study, we propose a novel diffusion-based framework tailored for WLE-to-NBI image translation. Building on Stable Diffusion with domain-specific enhancements, our method integrates LoRA fine-tuning to embed NBI-specific features and employs a self-attention injection mechanism to dynamically incorporate vascular and mucosal patterns while preserving the spatial structure and semantic integrity of the input WLE images. This approach ensures the fine-grained feature translation and structural fidelity that are crucial for medical applications. Quantitative and qualitative experiments highlight the superiority of the proposed approach in generating high-fidelity NBI-like images. Furthermore, it demonstrates potential for data augmentation and robustness in long-range video frame registration, offering a reliable solution for enhancing clinical decision-making.
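As a rough illustration of the two mechanisms named in the abstract, the NumPy sketch below shows a generic LoRA-style low-rank weight update and one common form of self-attention injection. It is not the authors' implementation. The abstract does not say which attention tensors are injected, so the choice made here (attention map computed from the source WLE branch, values taken from the generation branch) is an assumption, as are all function names, variable names, and shapes.

```python
import numpy as np

def lora_linear(x, W, A, B, alpha):
    """Linear layer with a frozen weight W and a low-rank update.

    Shapes (illustrative): x (n, d_in), W (d_out, d_in),
    A (r, d_in), B (d_out, r). Only A and B would be trained.
    """
    r = A.shape[0]
    W_eff = W + (alpha / r) * (B @ A)  # generic LoRA update
    return x @ W_eff.T

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def injected_attention(q_src, k_src, v_gen):
    """Self-attention with injected queries and keys.

    ASSUMPTION: the attention map comes from the source (WLE) branch,
    which carries spatial structure, while the values come from the
    generation branch. The paper's actual mechanism may differ.
    """
    d = q_src.shape[-1]
    attn = softmax(q_src @ k_src.T / np.sqrt(d))  # (tokens, tokens)
    return attn @ v_gen
```

With `B` initialised to zeros, as is usual for LoRA, the layer starts out identical to the frozen base layer, so fine-tuning begins from the pretrained behaviour.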
About the journal:
The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.