Title: SemiSAM+: Rethinking semi-supervised medical image segmentation in the era of foundation models
Authors: Yichi Zhang, Bohao Lv, Le Xue, Wenbo Zhang, Yuchen Liu, Yu Fu, Yuan Cheng, Yuan Qi
Journal: Medical Image Analysis, vol. 107, Article 103733, published 2025-07-25
DOI: 10.1016/j.media.2025.103733
Abstract: Deep learning-based medical image segmentation typically requires large amounts of labeled data for training, which limits its clinical applicability due to high annotation costs. Semi-supervised learning (SSL) has emerged as an appealing strategy because it depends far less on abundant expert annotations than fully supervised methods. Beyond existing model-centric SSL advances that design novel regularization strategies, we anticipate a paradigm shift driven by the emergence of promptable segmentation foundation models with universal segmentation capabilities steered by positional prompts, as exemplified by the Segment Anything Model (SAM). In this paper, we present SemiSAM+, a foundation model-driven SSL framework that learns efficiently from limited labeled data for medical image segmentation. SemiSAM+ consists of one or more promptable foundation models acting as generalist models and a trainable task-specific segmentation model acting as the specialist model. For a new segmentation task, training follows a specialist-generalist collaborative learning procedure: the trainable specialist model delivers positional prompts to interact with the frozen generalist models and acquire pseudo-labels, and the generalist models' outputs in turn provide the specialist model with informative and efficient supervision that benefits both automatic segmentation and prompt generation. Extensive experiments on three public datasets and one in-house clinical dataset demonstrate that SemiSAM+ achieves significant performance improvements, especially under extremely limited annotation scenarios, and serves as an efficient plug-and-play strategy that can be easily adapted to different specialist and generalist models.

Title: Robust T-Loss for medical image segmentation
Authors: Alvaro Gonzalez-Jimenez, Simone Lionetti, Philippe Gottfrois, Fabian Gröger, Alexander Navarini, Marc Pouly
Journal: Medical Image Analysis, vol. 105, Article 103735, published 2025-07-25
DOI: 10.1016/j.media.2025.103735
Abstract: This work introduces T-Loss, a novel and robust loss function for medical image segmentation. T-Loss is derived from the negative log-likelihood of the Student-t distribution and excels at handling noisy masks by dynamically controlling its sensitivity through a single parameter. This parameter is optimized during backpropagation, obviating the need for additional computations or prior knowledge about the extent and distribution of noisy labels. We provide an in-depth analysis of the parameter's behavior during training, revealing its adaptive nature and its role in preventing memorization of noisy labels. Our extensive experiments demonstrate that T-Loss significantly outperforms traditional loss functions in terms of Dice scores on two public medical datasets, specifically for skin lesion and lung segmentation. Moreover, T-Loss exhibits remarkable resilience to various types of simulated label noise that mimic human annotation errors. Our results provide strong evidence that T-Loss is a promising alternative for medical image segmentation where high levels of label noise or outliers are common in practice. The project website, including code and additional resources, is available at https://robust-tloss.github.io/.

Title: Controllable illumination invariant GAN for diverse temporally-consistent surgical video synthesis
Authors: Long Chen, Mobarak I. Hoque, Zhe Min, Matt Clarkson, Thomas Dowrick
Journal: Medical Image Analysis, vol. 105, Article 103731, published 2025-07-25
DOI: 10.1016/j.media.2025.103731
Abstract: Surgical video synthesis offers a cost-effective way to expand training data and enhance the performance of machine learning models in computer-assisted surgery. However, existing video translation methods often produce video sequences with large illumination changes across different views, disrupting the temporal consistency of the videos. In addition, these methods typically synthesize videos with a monotonous style, whereas diverse synthetic data are desired to improve the generalization ability of downstream machine learning models. To address these challenges, we propose a novel Controllable Illumination Invariant Generative Adversarial Network (CIIGAN) for generating diverse, illumination-consistent video sequences. CIIGAN fuses multi-scale illumination-invariant features from a novel controllable illumination-invariant (CII) image space with multi-scale texture-invariant features from self-constructed 3D scenes. The CII image space, together with the 3D scenes, allows CIIGAN to produce diverse and temporally consistent video or image translations. Extensive experiments demonstrate that CIIGAN achieves more realistic and illumination-consistent translations than previous state-of-the-art baselines. Furthermore, segmentation networks trained on our diverse synthetic data outperform those trained on monotonous synthetic data. Our source code, trained models, and 3D simulation scenes are publicly available at https://github.com/LongChenCV/CIIGAN.

Title: Spectrum intervention based invariant causal representation learning for single-domain generalizable medical image segmentation
Authors: Wentao Liu, Zhiwei Ni, Xuhui Zhu, Qian Chen, Liping Ni, Pingfan Xia
Journal: Medical Image Analysis, vol. 105, Article 103741, published 2025-07-25
DOI: 10.1016/j.media.2025.103741
Abstract: The performance of a well-trained segmentation model is often limited by domain shift caused by acquisition variation. Existing efforts focus on expanding the diversity of single-source samples and on learning domain-invariant representations; essentially, they still model the statistical dependence between sample-label pairs and achieve only a superficial portrayal of reality. In contrast, we propose a Spectrum Intervention based Invariant Causal Representation Learning (SI²CRL) framework that unifies data generation and representation learning from a causal view. For data generation, the unknown object elements are reified in the frequency domain as phase variables, and an amplitude-based intervention module generates low-frequency perturbations via a random-weighted multilayer convolutional network. For the causal representations, a two-stage causal synergy modeling process derives the unobservable causal factors. In the first stage, style-sensitive non-causal factors in the shallow layers of the encoder are filtered out by a contrastive causal decoupling mechanism. In the second stage, the hierarchical features in the decoder are first factorized with cross-covariance regularization to ensure channel-wise independence; an adversarial causal purification module then encourages the decoder to iteratively refine causally sufficient information and make domain-robust predictions. We evaluate SI²CRL against state-of-the-art methods on cross-site prostate MRI segmentation, cross-modality (CT-MRI) abdominal multi-organ segmentation, and cross-sequence (MRI) cardiac segmentation, achieving consistent performance gains over these peer methods.

{"title":"SET: Superpixel Embedded Transformer for skin lesion segmentation","authors":"Zhonghua Wang , Junyan Lyu , Xiaoying Tang","doi":"10.1016/j.media.2025.103738","DOIUrl":"10.1016/j.media.2025.103738","url":null,"abstract":"<div><div>Accurate skin lesion segmentation is crucial for the early detection and treatment of skin cancer. Despite significant advances in deep learning, current segmentation methods often struggle to fully capture global contextual information and maintain the structural integrity of skin lesions. To address these challenges, this paper introduces Superpixel Embedded Transformer (SET), which integrates superpixels into the Transformer framework for skin lesion segmentation. Instead of embedding non-overlapping patches as tokens, SET employs an Association Embedded Merging & Dispatching (AEM&D) module to treat superpixels as the fundamental units during both the down-sampling and up-sampling phases. To better capture the multi-scale information of lesions, we propose a superpixel bank to store various superpixel maps with distinct compactness values. An Ensemble Fusion and Refinery (EFR) module is then designed to fuse and refine the results obtained from each map in the superpixel bank. This approach enables the model to selectively focus on different features by adopting various superpixel maps, thereby enhancing the segmentation performance. Extensive experiments are conducted on multiple skin lesion segmentation datasets, including ISIC 2016, ISIC 2017, and ISIC 2018. Comparative analyses with state-of-the-art methods showcase SET’s superior performance, and ablation studies confirm the effectiveness of our proposed modules incorporating superpixels into Vision Transformer. The source code of our SET will be available at <span><span>https://github.com/Wzhjerry/SET</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103738"},"PeriodicalIF":11.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144725051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: PitVis-2023 challenge: Workflow recognition in videos of endoscopic pituitary surgery
Authors: Adrito Das, Danyal Z. Khan, Dimitrios Psychogyios, Yitong Zhang, John G. Hanrahan, Francisco Vasconcelos, You Pang, Zhen Chen, Jinlin Wu, Xiaoyang Zou, Guoyan Zheng, Abdul Qayyum, Moona Mazher, Imran Razzak, Tianbin Li, Jin Ye, Junjun He, Szymon Płotka, Joanna Kaleta, Amine Yamlahi, Sophia Bano
Journal: Medical Image Analysis, vol. 107, Article 103716, published 2025-07-23
DOI: 10.1016/j.media.2025.103716
Abstract: The field of computer vision applied to videos of minimally invasive surgery is ever-growing. Workflow recognition refers to the automated recognition of various aspects of a surgery, including which surgical steps are performed and which surgical instruments are used. This information can later be used to assist clinicians when learning the surgery or during live surgery. The Pituitary Vision (PitVis) 2023 Challenge tasked the community with step and instrument recognition in videos of endoscopic pituitary surgery. This is a particularly challenging task compared to other minimally invasive surgeries due to the smaller working space, which limits and distorts vision, and the higher frequency of instrument and step switching, which requires more precise model predictions. Participants were provided with 25 videos, and results were presented at the MICCAI 2023 conference as part of the Endoscopic Vision 2023 Challenge in Vancouver, Canada, on 8 October 2023. There were 18 submissions from 9 teams across 6 countries, using a variety of deep learning models. The top-performing model for step recognition used a transformer-based architecture, uniquely combining an autoregressive decoder with a positional encoding input. The top-performing model for instrument recognition used a spatial encoder followed by a temporal encoder, uniquely employing a 2-layer temporal architecture. In both cases, these models outperformed purely spatial models, illustrating the importance of sequential and temporal information. PitVis-2023 therefore demonstrates that state-of-the-art computer vision models for minimally invasive surgery are transferable to a new dataset. Benchmark results are provided in the paper, and the dataset is publicly available at https://doi.org/10.5522/04/26531686.

Title: Rethinking data imbalance in class incremental surgical instrument segmentation
Authors: Shifang Zhao, Long Bai, Kun Yuan, Feng Li, Jieming Yu, Wenzhen Dong, Guankun Wang, Mobarakol Islam, Nicolas Padoy, Nassir Navab, Hongliang Ren
Journal: Medical Image Analysis, vol. 105, Article 103728, published 2025-07-22
DOI: 10.1016/j.media.2025.103728
Abstract: In surgical instrument segmentation, the increasing variety of instruments over time poses a significant challenge for existing neural networks, as they are unable to learn such incremental tasks effectively and suffer from catastrophic forgetting: when learning new data, the model experiences a sharp performance drop on previously learned data. Although several continual learning methods have been proposed for incremental understanding tasks in surgical scenarios, data imbalance often leads to a strong bias in the segmentation head and poor performance. Data imbalance can occur in two forms: (i) class imbalance between new and old data, and (ii) class imbalance within data from the same time point. Such imbalances often let the dominant classes take over the training of continual semantic segmentation (CSS). To address this issue, we propose SurgCSS, a novel plug-and-play CSS framework for surgical instrument segmentation under data imbalance. Specifically, we generate realistic surgical backgrounds through inpainting and blend instrument foregrounds with the generated backgrounds in a class-aware manner to balance the data distribution across scenarios. We further propose a Class Desensitization Loss that employs contrastive learning to correct edge biases caused by data imbalance. Moreover, we dynamically fuse the weight parameters of the old and new models to achieve a better trade-off between the biased and unbiased model weights. To investigate the data imbalance problem in surgical scenarios, we construct a new benchmark for surgical instrument CSS by integrating four public datasets: EndoVis 2017, EndoVis 2018, CholecSeg8k, and SAR-RAPR50. Extensive experiments demonstrate the effectiveness of the proposed framework, which achieves significant performance improvements over existing baselines and shows excellent potential for clinical applications. The code is publicly available at github.com/Zzsf11/SurgCSS.

{"title":"A new dataset and versatile multi-task surgical workflow analysis framework for thoracoscopic mitral valvuloplasty","authors":"Meng Lan , Weixin Si , Xinjian Yan , Xiaomeng Li","doi":"10.1016/j.media.2025.103724","DOIUrl":"10.1016/j.media.2025.103724","url":null,"abstract":"<div><div>Surgical Workflow Analysis (SWA) on videos is critical for AI-assisted intelligent surgery. Existing SWA methods primarily focus on laparoscopic surgeries, while research on complex thoracoscopy-assisted cardiac surgery remains largely unexplored. In this paper, we introduce <strong>TMVP-SurgVideo</strong>, the first SWA video dataset for thoracoscopic cardiac mitral valvuloplasty (TMVP). TMVP-SurgVideo comprises 57 independent long-form surgical videos and over 429K annotated frames, covering four key tasks, namely phase and instrument recognitions, and phase and instrument anticipations. To achieve a comprehensive SWA system for TMVP and overcome the limitations of current SWA methods, we propose <strong>SurgFormer</strong>, the first query-based Transformer framework that simultaneously performs recognition and anticipation of surgical phases and instruments. SurgFormer uses four low-dimensional learnable task embeddings to independently decode representation embeddings for the predictions of the four tasks. During the decoding process, an information interaction module that contains the intra-frame task-level information interaction layer and the inter-frame temporal correlation learning layer is devised to operate on the task embeddings, enabling the information collaboration between tasks within each frame and temporal correlation learning of each task across frames. Besides, SurgFormer’s unique architecture allows it to perform both offline and online inferences using a dynamic memory bank without model modification. Our proposed SurgFormer is evaluated on the TMVP-SurgVideo and existing Cholec80 datasets to demonstrate its effectiveness on SWA. The dataset and the code are available at <span><span>https://github.com/xmed-lab/SurgFormer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103724"},"PeriodicalIF":11.8,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144725052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: POPAR: Patch Order Prediction and Appearance Recovery for self-supervised learning in chest radiography
Authors: Jiaxuan Pang, Dongao Ma, Ziyu Zhou, Michael B. Gotway, Jianming Liang
Journal: Medical Image Analysis, vol. 105, Article 103720, published 2025-07-19
DOI: 10.1016/j.media.2025.103720
Abstract: Self-supervised learning (SSL) has proven effective in reducing dependence on large annotated datasets while achieving state-of-the-art (SoTA) performance in computer vision. However, its adoption in medical imaging remains slow due to fundamental differences between photographic and medical images. To address this, we propose POPAR (Patch Order Prediction and Appearance Recovery), a novel SSL framework tailored for medical image analysis, particularly chest X-ray interpretation. POPAR introduces two key learning strategies: (1) patch order prediction, which helps the model learn anatomical structures and spatial relationships by predicting the arrangement of shuffled patches, and (2) patch appearance recovery, which reconstructs fine-grained details to enhance texture-based feature learning. Using a Swin Transformer backbone, POPAR is pretrained on a large-scale dataset and extensively evaluated across multiple tasks, outperforming both SSL and fully supervised SoTA models in classification, segmentation, anatomical understanding, bias robustness, and data efficiency. Our findings highlight POPAR's scalability, strong generalization, and effectiveness in medical imaging applications. All code and models are available at GitHub.com/JLiangLab/POPAR (Version 2).

Title: HELPNet: Hierarchical perturbations consistency and entropy-guided ensemble for scribble supervised medical image segmentation
Authors: Xiao Zhang, Shaoxuan Wu, Peilin Zhang, Zhuo Jin, Xiaosong Xiong, Qirong Bu, Jingkun Chen, Jun Feng
Journal: Medical Image Analysis, vol. 105, Article 103719, published 2025-07-17
DOI: 10.1016/j.media.2025.103719
Abstract: Creating fully annotated labels for medical image segmentation is prohibitively time-intensive and costly, emphasizing the need for approaches that minimize reliance on detailed annotations. Scribble annotations are a far cheaper alternative, significantly reducing annotation expense, but they provide only limited and imprecise information and fail to capture the detailed structural and boundary characteristics necessary for accurate organ delineation. To address these challenges, we propose HELPNet, a novel scribble-based weakly supervised segmentation framework designed to bridge the gap between annotation efficiency and segmentation performance. HELPNet integrates three modules. The Hierarchical perturbations consistency (HPC) module enhances feature learning by applying density-controlled jigsaw perturbations across global, local, and focal views, enabling robust modeling of multi-scale structural representations. Building on this, the Entropy-guided pseudo-label (EGPL) module evaluates the confidence of segmentation predictions using entropy and generates high-quality pseudo-labels. Finally, the Structural prior refinement (SPR) module incorporates connectivity analysis and image boundary priors to refine pseudo-label quality and strengthen supervision. Experimental results on three public datasets (ACDC, MSCMRseg, and CHAOS) show that HELPNet significantly outperforms state-of-the-art scribble-based weakly supervised methods and achieves performance comparable to fully supervised methods. The code is available at https://github.com/IPMI-NWU/HELPNet.