Controllable illumination invariant GAN for diverse temporally-consistent surgical video synthesis
Long Chen, Mobarak I. Hoque, Zhe Min, Matt Clarkson, Thomas Dowrick
Medical Image Analysis, Volume 105, Article 103731, 2025-07-25. DOI: 10.1016/j.media.2025.103731
Abstract: Surgical video synthesis offers a cost-effective way to expand training data and enhance the performance of machine learning models in computer-assisted surgery. However, existing video translation methods often produce video sequences with large illumination changes across different views, disrupting the temporal consistency of the videos. Additionally, these methods typically synthesize videos in a monotonous style, whereas diverse synthetic data is desired to improve the generalization ability of downstream machine learning models. To address these challenges, we propose a novel Controllable Illumination Invariant Generative Adversarial Network (CIIGAN) for generating diverse, illumination-consistent video sequences. CIIGAN fuses multi-scale illumination-invariant features from a novel controllable illumination-invariant (CII) image space with multi-scale texture-invariant features from self-constructed 3D scenes. The CII image space, together with the 3D scenes, allows CIIGAN to produce diverse and temporally-consistent video or image translations. Extensive experiments demonstrate that CIIGAN achieves more realistic and illumination-consistent translations than previous state-of-the-art baselines. Furthermore, segmentation networks trained on our diverse synthetic data outperform those trained on monotonous synthetic data. Our source code, trained models, and 3D simulation scenes are publicly available at https://github.com/LongChenCV/CIIGAN.

POPAR: Patch Order Prediction and Appearance Recovery for self-supervised learning in chest radiography
Jiaxuan Pang, Dongao Ma, Ziyu Zhou, Michael B. Gotway, Jianming Liang
Medical Image Analysis, Volume 105, Article 103720, 2025-07-19. DOI: 10.1016/j.media.2025.103720
Abstract: Self-supervised learning (SSL) has proven effective in reducing the dependency on large annotated datasets while achieving state-of-the-art (SoTA) performance in computer vision. However, its adoption in medical imaging remains slow due to fundamental differences between photographic and medical images. To address this, we propose POPAR (Patch Order Prediction and Appearance Recovery), a novel SSL framework tailored for medical image analysis, particularly chest X-ray interpretation. POPAR introduces two key learning strategies: (1) patch order prediction, which helps the model learn anatomical structures and spatial relationships by predicting the arrangement of shuffled patches, and (2) patch appearance recovery, which reconstructs fine-grained details to enhance texture-based feature learning. Using a Swin Transformer backbone, POPAR is pretrained on a large-scale dataset and extensively evaluated across multiple tasks, outperforming both SSL and fully supervised SoTA models in classification, segmentation, anatomical understanding, bias robustness, and data efficiency. Our findings highlight POPAR's scalability, strong generalization, and effectiveness in medical imaging applications. All code and models are available at GitHub.com/JLiangLab/POPAR (Version 2).

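The patch order prediction pretext task can be illustrated with a minimal sketch: split the radiograph into a grid of patches, shuffle them, and train the model to predict the permutation. This is a generic illustration (plain NumPy, hypothetical function name), not POPAR's Swin Transformer implementation.

```python
import numpy as np

def shuffle_patches(image, grid=4, rng=None):
    """Split a square image into grid x grid patches, shuffle them, and
    return the shuffled image plus the permutation used as the
    order-prediction target."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    patches = [image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
               for r in range(grid) for c in range(grid)]
    perm = rng.permutation(len(patches))
    shuffled = np.zeros_like(image)
    for dst, src in enumerate(perm):
        r, c = divmod(dst, grid)
        shuffled[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = patches[src]
    return shuffled, perm

# The SSL model receives `shuffled` and is trained to predict `perm`
# (patch order) while also reconstructing the original appearance.
img = np.arange(16 * 16, dtype=float).reshape(16, 16)
shuffled, perm = shuffle_patches(img, grid=4)
```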
HELPNet: Hierarchical perturbations consistency and entropy-guided ensemble for scribble supervised medical image segmentation
Xiao Zhang, Shaoxuan Wu, Peilin Zhang, Zhuo Jin, Xiaosong Xiong, Qirong Bu, Jingkun Chen, Jun Feng
Medical Image Analysis, Volume 105, Article 103719, 2025-07-17. DOI: 10.1016/j.media.2025.103719
Abstract: Creating fully annotated labels for medical image segmentation is prohibitively time-intensive and costly, emphasizing the need for approaches that minimize reliance on detailed annotations. Scribble annotations offer a more cost-effective alternative, significantly reducing annotation expenses; however, they provide only limited and imprecise information, failing to capture the detailed structural and boundary characteristics necessary for accurate organ delineation. To address these challenges, we propose HELPNet, a novel scribble-based weakly supervised segmentation framework designed to bridge the gap between annotation efficiency and segmentation performance. HELPNet integrates three modules. The hierarchical perturbations consistency (HPC) module enhances feature learning by employing density-controlled jigsaw perturbations across global, local, and focal views, enabling robust modeling of multi-scale structural representations. Building on this, the entropy-guided pseudo-label (EGPL) module evaluates the confidence of segmentation predictions using entropy, generating high-quality pseudo-labels. Finally, the structural prior refinement (SPR) module integrates connectivity analysis and image boundary priors to refine pseudo-label quality and enhance supervision. Experimental results on three public datasets (ACDC, MSCMRseg, and CHAOS) show that HELPNet significantly outperforms state-of-the-art methods for scribble-based weakly supervised segmentation and achieves performance comparable to fully supervised methods. The code is available at https://github.com/IPMI-NWU/HELPNet.

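The entropy-guided pseudo-labelling idea behind EGPL can be sketched as follows. This is a generic illustration of entropy-based confidence filtering on a softmax map, not the authors' exact implementation.

```python
import numpy as np

def entropy_map(probs, eps=1e-8):
    """Per-pixel predictive entropy of a softmax output of shape (C, H, W).
    Low entropy means a peaked, confident prediction."""
    return -np.sum(probs * np.log(probs + eps), axis=0)

def entropy_guided_pseudo_labels(probs, threshold):
    """Keep argmax labels only where entropy is below `threshold`;
    uncertain pixels get the ignore label -1 (excluded from the loss)."""
    labels = probs.argmax(axis=0)
    labels[entropy_map(probs) > threshold] = -1
    return labels
```

A downstream loss would then simply skip pixels labelled -1, so only confident pseudo-labels supervise the student network.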
Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models
Alexander Koch, Orhun Utku Aydin, Adam Hilbert, Jana Rieger, Satoru Tanioka, Fujimaro Ishida, Dietmar Frey
Medical Image Analysis, Volume 105, Article 103722, 2025-07-17. DOI: 10.1016/j.media.2025.103722
Abstract: Cerebrovascular disease often requires multiple imaging modalities for accurate diagnosis, treatment, and monitoring. Computed Tomography Angiography (CTA) and Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) are two common non-invasive angiography techniques, each with distinct strengths in accessibility, safety, and diagnostic accuracy. While CTA is more widely used in acute stroke due to its faster acquisition times and higher diagnostic accuracy, TOF-MRA is preferred for its safety, as it avoids radiation exposure and contrast agent-related health risks. Despite the predominant role of CTA in clinical workflows, there is a scarcity of open-source CTA data, limiting the research and development of AI models for tasks such as large vessel occlusion detection and aneurysm segmentation. This study explores diffusion-based image-to-image translation models to generate synthetic CTA images from TOF-MRA input. We demonstrate the modality conversion from TOF-MRA to CTA and show that diffusion models outperform a traditional U-Net-based approach. Our work compares different state-of-the-art diffusion architectures and samplers, offering recommendations for optimal model performance in this cross-modality translation task.

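A single reverse step of conditional DDPM sampling, with the source modality passed to the noise predictor, looks roughly like the sketch below. The noise-prediction network `eps_model` is a placeholder (the paper compares several architectures and samplers), and the variance choice here is the simple textbook one, not necessarily the one used by the authors.

```python
import numpy as np

def ddpm_reverse_step(x_t, cond, t, alphas, alpha_bars, eps_model, rng):
    """One reverse step x_t -> x_{t-1} of conditional DDPM sampling.
    `cond` (e.g. the TOF-MRA image) conditions the noise predictor."""
    a_t, ab_t = alphas[t], alpha_bars[t]
    eps = eps_model(x_t, cond, t)  # predicted noise epsilon_theta
    mean = (x_t - (1 - a_t) / np.sqrt(1 - ab_t) * eps) / np.sqrt(a_t)
    if t == 0:
        return mean                # final step: no noise added
    sigma = np.sqrt(1 - a_t)       # simple fixed-variance choice
    return mean + sigma * rng.standard_normal(x_t.shape)
```

Sampling then iterates this step from t = T-1 down to 0, starting from Gaussian noise, with the same TOF-MRA conditioning at every step.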
{"title":"ViFT: Visual field transformer for visual field testing via deep reinforcement learning","authors":"Shozo Saeki, Minoru Kawahara, Hirohisa Aman","doi":"10.1016/j.media.2025.103721","DOIUrl":"10.1016/j.media.2025.103721","url":null,"abstract":"<div><div>Visual field testing (perimetry) quantifies a patient’s visual field sensitivity to diagnosis and follow-up on their visual impairments. Visual field testing would require the patients to concentrate on the test for a long time. However, a longer testing time makes patients more exhausted and leads to a decrease in testing accuracy. Thus, it is helpful to develop a well-designed strategy to finish the testing more quickly while maintaining high accuracy. This paper proposes the visual field transformer (ViFT) for visual field testing with deep reinforcement learning. This study contributes to the following four: (1) ViFT can fully control the visual field testing process. (2) ViFT learns the relationships of visual field locations without any pre-defined information. (3) ViFT learning process can consider the patient perception uncertainty. (4) ViFT achieves the same or higher accuracy than the other strategies, and half as test time as the other strategies. Our experiments demonstrate the ViFT efficiency on the 24-2 test pattern compared with other strategies.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103721"},"PeriodicalIF":10.7,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144662148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning homeomorphic image registration via conformal-invariant hyperelastic regularisation
Jing Zou, Noémie Debroux, Lihao Liu, Jing Qin, Carola-Bibiane Schönlieb, Angelica I. Aviles-Rivero
Medical Image Analysis, Volume 105, Article 103712, 2025-07-15. DOI: 10.1016/j.media.2025.103712
Abstract: Deformable image registration is a fundamental task in medical image analysis and plays a crucial role in a wide range of clinical applications. Recently, deep learning-based approaches have been widely studied for deformable medical image registration and achieved promising results. However, existing deep learning registration techniques do not theoretically guarantee topology-preserving transformations. This is a key property for preserving anatomical structures and achieving plausible transformations usable in real clinical settings. We propose a novel framework for deformable image registration. Firstly, we introduce a novel regulariser based on conformal-invariant properties in a nonlinear elasticity setting. Our regulariser enforces the deformation field to be smooth, invertible and orientation-preserving. More importantly, we strictly guarantee topology preservation, yielding a clinically meaningful registration. Secondly, we boost the performance of our regulariser through coordinate MLPs, where one can view the to-be-registered images as continuously differentiable entities. We demonstrate, through numerical and visual experiments, that our framework is able to outperform current techniques for image registration.

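Orientation preservation of a deformation is commonly verified by checking that the Jacobian determinant of the mapping stays positive everywhere. A finite-difference sketch for a 2D displacement field follows; this is the standard diagnostic check, not the paper's regulariser itself.

```python
import numpy as np

def jacobian_det_2d(disp):
    """Finite-difference Jacobian determinant of phi(x) = x + u(x) for a
    2D displacement field u of shape (2, H, W): disp[0] is the
    x-displacement, disp[1] the y-displacement. A determinant that is
    positive everywhere indicates a locally invertible,
    orientation-preserving map (no folding)."""
    dux_dy, dux_dx = np.gradient(disp[0])  # gradients along rows, cols
    duy_dy, duy_dx = np.gradient(disp[1])
    return (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx
```

For the identity transform (zero displacement) the determinant is exactly 1 at every pixel; folding would show up as negative values.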
{"title":"ROP lesion segmentation via sequence coding and block balancing","authors":"Xiping Jia , Jianying Qiu , Dong Nie , Tian Liu","doi":"10.1016/j.media.2025.103723","DOIUrl":"10.1016/j.media.2025.103723","url":null,"abstract":"<div><div>Retinopathy of prematurity (ROP) is a potentially blinding retinal disease that often affects low birth weight premature infants. Lesion detection and recognition are crucial for ROP diagnosis and clinical treatment. However, this task poses challenges for both ophthalmologists and computer-based systems due to the small size and subtle nature of many ROP lesions. To address these challenges, we present a Sequence encoding and Block balancing-based Segmentation Network (SeBSNet), which incorporates domain knowledge coding, sequence coding learning (SCL), and block-weighted balancing (BWB) techniques into the segmentation of ROP lesions. The experimental results demonstrate that SeBSNet outperforms existing state-of-the-art methods in the segmentation of ROP lesions, with average ROC_AUC, PR_AUC, and Dice scores of 98.84%, 71.90%, and 66.88%, respectively. Furthermore, the integration of the proposed techniques into ROP classification networks as an enhancing module leads to considerable improvements in classification performance.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103723"},"PeriodicalIF":10.7,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144670066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DCrownFormer+: Morphology-aware mesh generation and refinement transformer for dental crown prosthesis from 3D scan data of preparation and antagonist teeth
Su Yang, Ji Yong Han, Sang-Heon Lim, Sujeong Kim, Jungro Lee, Keun-Suh Kim, Jun-Min Kim, Won-Jin Yi
Medical Image Analysis, 2025-07-14. DOI: 10.1016/j.media.2025.103717
Abstract: Dental prostheses are important in designing artificial replacements to restore the function and appearance of teeth. However, designing a patient-specific dental prosthesis is still labor-intensive and depends on dental professionals with knowledge of oral anatomy and their experience. Also, this procedure is time-consuming because the initial tooth template for designing dental crowns is not personalized. In this paper, we propose a novel end-to-end morphology-aware mesh generation and refinement transformer called DCrownFormer+ to directly and efficiently generate high-fidelity and realistic meshes for dental crowns from the mesh inputs of 3D scans of preparation and antagonist teeth. DCrownFormer+ captures local and global geometric features from mesh inputs using a geometric feature attention descriptor and the transformer encoder. We leverage a morphology-aware cross-attention module with a curvature-penalty Chamfer distance loss (CPL) to generate the points and normals of a dental crown from geometric features at the transformer decoder. Then, a coarse indicator grid is directly estimated from the generated points and normals of the dental crown using differentiable Poisson surface reconstruction. To further improve the fine details of the occlusal surfaces, we propose a learning-based refinement method, an implicit grid refinement network with a gradient-penalty mesh reconstruction loss (GPL), to generate high-fidelity and realistic dental crown meshes by refining the details of the coarse indicator grid. Our experimental results demonstrate that DCrownFormer+ is superior to other methods in improving the shape completeness, surface smoothness, and morphological details of occlusal surfaces, such as dental grooves and cusps. We further validate the effectiveness of key components and the significant benefits of CPL and GPL through ablation studies.

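The curvature-penalty Chamfer distance loss (CPL) builds on the standard symmetric Chamfer distance between predicted and ground-truth point sets. The base term can be sketched as below; the curvature weighting itself is specific to the paper and omitted here.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3):
    mean nearest-neighbour squared distance in both directions. O(N*M)
    brute force; real pipelines use a KD-tree or GPU batching."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

The distance is zero exactly when every point in each set has a coincident nearest neighbour in the other, which is why it is a natural fitting term for generated crown points against the reference mesh surface.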
{"title":"Attention-enhanced Dual-stream Registration Network via Mixed Attention Transformer and Gated Adaptive Fusion","authors":"Yuan Chang, Zheng Li","doi":"10.1016/j.media.2025.103713","DOIUrl":"10.1016/j.media.2025.103713","url":null,"abstract":"<div><div>Deformable registration requires extracting salient features within each image and finding feature pairs with potential matching possibilities between the moving and fixed images, thereby estimating the deformation field used to align the images to be registered. With the development of deep learning, various deformable registration networks utilizing advanced architectures such as CNNs or Transformers have been proposed, showing excellent registration performance. However, existing works fail to effectively achieve both feature extraction within images and feature matching between images simultaneously. In this paper, we propose a novel Attention-enhanced Dual-stream Registration Network (ADRNet) for deformable brain MRI registration. First, we use parallel CNN modules to extract shallow features from the moving and fixed images separately. Then, we propose a Mixed Attention Transformer (MAT) module with self-attention, cross-attention, and local attention to model self-correlation and cross-correlation to find features for matching. Finally, we improve skip connections, a key component of U-shape networks ignored by existing methods. We propose a Gated Adaptive Fusion (GAF) module with a gate mechanism, using decoding features to control the encoding features transmitted through skip connections, to better integrate encoder–decoder features, thereby obtaining matching features with more accurate one-to-one correspondence. 
The extensive and comprehensive experiments on three public brain MRI datasets demonstrate that our method achieves state-of-the-art registration performance.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103713"},"PeriodicalIF":10.7,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144670083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
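The cross-attention used for feature matching between moving and fixed images follows standard scaled dot-product attention: queries come from one image's features, keys and values from the other's, so each query position aggregates its best-matching counterpart features. A single-head NumPy sketch (illustrative only, not the MAT module itself):

```python
import numpy as np

def cross_attention(q_feat, kv_feat):
    """Scaled dot-product cross-attention. q_feat: (N, d) queries from the
    moving image; kv_feat: (M, d) keys/values from the fixed image.
    Returns (N, d): each moving feature as a softmax-weighted blend of
    fixed features, weighted by similarity."""
    d = q_feat.shape[-1]
    scores = q_feat @ kv_feat.T / np.sqrt(d)          # (N, M) similarities
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ kv_feat
```

Swapping the roles of the two feature sets gives the symmetric direction; self-attention is the special case where queries, keys, and values all come from the same image.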
Efficient one-shot federated learning on medical data using knowledge distillation with image synthesis and client model adaptation
Myeongkyun Kang, Philip Chikontwe, Soopil Kim, Kyong Hwan Jin, Ehsan Adeli, Kilian M. Pohl, Sang Hyun Park
Medical Image Analysis, Volume 105, Article 103714, 2025-07-14. DOI: 10.1016/j.media.2025.103714
Abstract: One-shot federated learning (FL) has emerged as a promising solution in scenarios where multiple communication rounds are not practical. Though previous methods using knowledge distillation (KD) with synthetic images have shown promising results in transferring clients' knowledge to the global model in one-shot FL, overfitting and extensive computations still persist. To tackle these issues, we propose a novel one-shot FL framework that generates pseudo intermediate samples using mixup, which incorporates synthesized images with diverse types of structure noise. This approach (i) enhances the diversity of training samples, preventing overfitting and providing informative visual clues for effective training, and (ii) allows for the reuse of synthesized images, reducing computational resources and improving overall training efficiency. To mitigate domain disparity introduced by noise, we design noise-adapted client models by updating batch normalization statistics on noise to enhance KD. With these in place, the training process involves iteratively updating the global model through KD with both the original and noise-adapted client models using pseudo-generated images. Extensive evaluations on five small-sized and three regular-sized medical image classification datasets demonstrate the superiority of our approach over previous methods.

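The pseudo intermediate samples come from mixup between a synthesized image and a structured-noise image. A minimal sketch with the standard Beta-distributed mixing coefficient follows; the specific noise types and KD loop are the paper's, and this blend step is only the generic mixup building block.

```python
import numpy as np

def mixup_sample(x_syn, x_noise, alpha=0.4, rng=None):
    """Blend a synthesized image with a structured-noise image to form a
    pseudo intermediate sample; lam ~ Beta(alpha, alpha) as in standard
    mixup. Returns the blended sample and the mixing coefficient."""
    if rng is None:
        rng = np.random.default_rng(0)
    lam = float(rng.beta(alpha, alpha))
    return lam * x_syn + (1.0 - lam) * x_noise, lam
```

Because lam varies per sample, one fixed pool of synthesized images yields many distinct training inputs, which is what lets the method reuse synthesized images instead of regenerating them.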