Learning dissection trajectories from expert surgical videos via imitation learning with equivariant diffusion
Hongyu Wang, Yonghao Long, Yueyao Chen, Hon-Chi Yip, Markus Scheppach, Philip Wai-Yan Chiu, Yeung Yam, Helen Mei-Ling Meng, Qi Dou
Medical Image Analysis, vol. 103, Article 103599. Published 2025-05-10. DOI: 10.1016/j.media.2025.103599

Endoscopic Submucosal Dissection (ESD) is a well-established endoscopic resection technique for removing epithelial lesions. Predicting dissection trajectories in ESD videos has the potential to strengthen and simplify surgical skills training, yet this approach has seldom been explored in previous research. While imitation learning has proven effective for learning skills from expert demonstrations, it struggles to predict uncertain future movements, learn geometric symmetries, and generalize to diverse surgical scenarios. This paper introduces imitation learning for the critical task of predicting dissection trajectories from expert video demonstrations. We propose a novel Implicit Diffusion Policy with Equivariant Representations for Imitation Learning (iDPOE) to address this variability. Our method implicitly models expert behaviors using a joint state-action distribution, capturing the inherent stochasticity of future dissection trajectories and enabling robust visual representation learning across various endoscopic views. By incorporating a diffusion model into policy learning, our approach facilitates efficient training and sampling, resulting in more accurate predictions and improved generalization. Additionally, we integrate equivariance into the learning process to enhance the model's ability to generalize to geometric symmetries in trajectory prediction. To enable conditional sampling from the implicit policy, we develop a forward-process-guided action inference strategy that corrects state mismatches. We evaluated our method on a collected ESD video dataset comprising nearly 2000 clips. Experimental results demonstrate that our approach outperforms both explicit and implicit state-of-the-art methods in trajectory prediction. To the best of our knowledge, this is the first endeavor to apply imitation learning to surgical skill learning in terms of dissection trajectory prediction.

Confidence intervals for performance estimates in brain MRI segmentation
Rosana El Jurdi, Gaël Varoquaux, Olivier Colliot
Medical Image Analysis, vol. 103, Article 103565. Published 2025-05-08. DOI: 10.1016/j.media.2025.103565

Medical segmentation models are evaluated empirically. As such an evaluation is based on a limited set of example images, it is unavoidably noisy. Beyond a mean performance measure, reporting confidence intervals is thus crucial; however, this is rarely done in medical image segmentation. The width of the confidence interval depends on the test set size and on the spread of the performance measure (its standard deviation across the test set). For classification, many test images are needed to avoid wide confidence intervals. Segmentation, however, has not been studied in this regard, and it differs in the amount of information brought by a given test image. In this paper, we study typical confidence intervals in the context of segmentation in 3D brain magnetic resonance imaging (MRI). We carry out experiments using the standard nnU-Net framework, two brain MRI datasets from the Medical Decathlon challenge (hippocampus and brain tumor segmentation), and two performance measures: the Dice Similarity Coefficient and the Hausdorff distance. We show that parametric confidence intervals are reasonable approximations of bootstrap estimates for varying test set sizes and spreads of the performance metric. Importantly, we show that the test set size needed to achieve a given precision is often much lower than for classification tasks. Typically, a 1%-wide confidence interval requires about 100-200 test samples when the spread is low (standard deviation around 3%). More difficult segmentation tasks may lead to higher spreads and require over 1000 samples. The corresponding code and notebooks are available on GitHub at https://github.com/rosanajurdi/SegVal_Repo.

CausalMixNet: A mixed-attention framework for causal intervention in robust medical image diagnosis
Yajie Zhang, Yu-An Huang, Yao Hu, Rui Liu, Jibin Wu, Zhi-An Huang, Kay Chen Tan
Medical Image Analysis, vol. 103, Article 103581. Published 2025-05-08. DOI: 10.1016/j.media.2025.103581

Confounding factors inherent in medical images can significantly impact the causal exploration capabilities of deep learning models, resulting in compromised accuracy and diminished generalization performance. In this paper, we present an innovative methodology named CausalMixNet that employs query-mixed intra-attention and key&value-mixed inter-attention to probe causal relationships between input images and labels. To mitigate unobservable confounding factors, CausalMixNet integrates a non-local reasoning module (NLRM) and key&value-mixed inter-attention (KVMIA) to conduct a front-door adjustment strategy. Furthermore, CausalMixNet incorporates a patch-masked ranking module (PMRM) and query-mixed intra-attention (QMIA) to enhance mediator learning, thereby facilitating causal intervention. The patch-mixing mechanism applied to query/(key&value) features within QMIA and KVMIA specifically targets lesion-related feature enhancement and the inference of average causal effects. CausalMixNet consistently outperforms existing methods, achieving superior accuracy and F1-scores across in-domain and out-of-domain scenarios on multiple datasets, with an average improvement of 3% over the closest competitor. Demonstrating robustness against noise, gender bias, and attribute bias, CausalMixNet excels in handling unobservable confounders, maintaining stable performance even in challenging conditions.

{"title":"SegQC: a segmentation network-based framework for multi-metric segmentation quality control and segmentation error detection in volumetric medical images","authors":"Bella Specktor-Fadida , Liat Ben-Sira , Dafna Ben-Bashat , Leo Joskowicz","doi":"10.1016/j.media.2025.103638","DOIUrl":"10.1016/j.media.2025.103638","url":null,"abstract":"<div><div>Quality control (QC) of structures segmentation in volumetric medical images is important for identifying segmentation errors in clinical practice and for facilitating model development by enhancing network performance in semi-supervised and active learning scenarios. This paper introduces SegQC, a novel framework for segmentation quality estimation and segmentation error detection. SegQC computes an estimate measure of the quality of a segmentation in volumetric scans and in their individual slices and identifies possible segmentation error regions within a slice. The key components of SegQC include: 1) SegQC<img>Net, a deep network that inputs a scan and its segmentation mask and outputs segmentation error probabilities for each voxel in the scan; 2) three new segmentation quality metrics computed from the segmentation error probabilities; 3) a new method for detecting possible segmentation errors in scan slices computed from the segmentation error probabilities. We introduce a novel evaluation scheme to measure segmentation error discrepancies based on an expert radiologist’s corrections of automatically produced segmentations that yields smaller observer variability and is closer to actual segmentation errors. We demonstrate SegQC on three fetal structures in 198 fetal MRI scans – fetal brain, fetal body and the placenta. To assess the benefits of SegQC, we compare it to the unsupervised Test Time Augmentation (TTA)-based QC and to supervised autoencoder (AE)-based QC. Our studies indicate that SegQC outperforms TTA-based quality estimation for whole scans and individual slices in terms of Pearson correlation and MAE for fetal body and fetal brain structures segmentation as well as for volumetric overlap metrics estimation of the placenta structure. Compared to both unsupervised TTA and supervised AE methods, SegQC achieves lower MAE for both 3D and 2D Dice estimates and higher Pearson correlation for volumetric Dice. Our segmentation error detection method achieved recall and precision rates of 0.77 and 0.48 for fetal body, and 0.74 and 0.55 for fetal brain segmentation error detection, respectively. Ranking derived from metrics estimation surpasses rankings based on entropy and sum for TTA and SegQC<img>Net estimations, respectively. SegQC provides high-quality metrics estimation for both 2D and 3D medical images as well as error localization within slices, offering important improvements to segmentation QC.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103638"},"PeriodicalIF":10.7,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PULASki: Learning inter-rater variability using statistical distances to improve probabilistic segmentation
Soumick Chatterjee, Franziska Gaidzik, Alessandro Sciarra, Hendrik Mattern, Gábor Janiga, Oliver Speck, Andreas Nürnberger, Sahani Pathiraja
Medical Image Analysis, vol. 103, Article 103623. Published 2025-05-07. DOI: 10.1016/j.media.2025.103623

In the domain of medical imaging, many supervised learning-based methods for segmentation face several challenges, such as high variability in annotations from multiple experts, paucity of labelled data, and class-imbalanced datasets. These issues may result in segmentations that lack the requisite precision for clinical analysis and can be misleadingly overconfident without associated uncertainty quantification. This work proposes PULASki, a computationally efficient generative tool for biomedical image segmentation that accurately captures variability in expert annotations, even in small datasets. The approach uses an improved loss function based on statistical distances in a conditional variational autoencoder structure (Probabilistic UNet), which improves learning of the conditional decoder compared to the standard cross-entropy, particularly in class-imbalanced problems. The proposed method was analysed on two structurally different segmentation tasks (intracranial vessel and multiple sclerosis (MS) lesion) and compared to four well-established baselines in terms of quantitative metrics and qualitative output. These experiments involve class-imbalanced datasets characterised by challenging features, including suboptimal signal-to-noise ratios and high ambiguity. Empirical results demonstrate that PULASki outperforms all baselines at the 5% significance level. Our experiments are also among the first to present a comparative study of the computationally feasible segmentation of complex geometries using 3D patches versus the traditional use of 2D slices. The generated segmentations are shown to be much more anatomically plausible than in the 2D case, particularly for the vessel task. Our method can also be applied to a wide range of multi-label segmentation tasks and is useful for downstream tasks such as hemodynamic modelling (computational fluid dynamics and data assimilation), clinical decision making, and treatment planning.

Rethinking boundary detection in deep learning-based medical image segmentation
Yi Lin, Dong Zhang, Xiao Fang, Yufan Chen, Kwang-Ting Cheng, Hao Chen
Medical Image Analysis, vol. 103, Article 103615. Published 2025-05-06. DOI: 10.1016/j.media.2025.103615

Medical image segmentation is a pivotal task within the realms of medical image analysis and computer vision. While current methods have shown promise in accurately segmenting major regions of interest, the precise segmentation of boundary areas remains challenging. In this study, we propose a novel network architecture, named CTO, which combines Convolutional Neural Networks (CNNs), Vision Transformer (ViT) models, and explicit edge detection operators to tackle this challenge. CTO surpasses existing methods in segmentation accuracy and strikes a better balance between accuracy and efficiency, without the need for additional data inputs or label injections. Specifically, CTO adheres to the canonical encoder-decoder network paradigm, with a dual-stream encoder comprising a mainstream CNN for capturing local features and an auxiliary StitchViT stream for integrating long-range dependencies. Furthermore, to enhance the model's ability to learn boundary areas, we introduce a boundary-guided decoder that employs binary boundary masks, generated by dedicated edge detection operators, to provide explicit guidance during the decoding process. We validate the performance of CTO through extensive experiments on seven challenging medical image segmentation datasets: ISIC 2016, PH2, ISIC 2018, CoNIC, LiTS17, BraTS, and BTCV. Our experimental results unequivocally demonstrate that CTO achieves state-of-the-art accuracy on these datasets while maintaining competitive model complexity. The code has been released at: CTO.

REPAIR: Reciprocal assistance imputation-representation learning for glioma diagnosis with incomplete MRI sequences
Chuixing Wu, Jincheng Xie, Fangrong Liang, Weixiong Zhong, Ruimeng Yang, Yuankui Wu, Tao Liang, Linjing Wang, Xin Zhen
Medical Image Analysis, vol. 103, Article 103634. Published 2025-05-06. DOI: 10.1016/j.media.2025.103634

The absence of MRI sequences is a common occurrence in clinical practice, posing a significant challenge for prediction modeling of non-invasive glioma (GM) diagnosis via fusion of multi-sequence MRI. To address this issue, we propose a novel unified reciprocal assistance imputation-representation learning framework (REPAIR) for GM diagnosis modeling with incomplete MRI sequences. REPAIR establishes a cooperative process between missing-value imputation and multi-sequence MRI fusion: existing samples inform the imputation of missing values, which in turn facilitates learning a shared latent representation that reciprocally guides more accurate imputation. To tailor the learned representation to downstream tasks, a novel ambiguity-aware intercorrelation regularization is introduced, which correlates imputation ambiguity with the impact it conveys to the learned representation via a fuzzy paradigm. Additionally, a multimodal structural calibration constraint is devised to correct for the structural shift caused by missing data, ensuring structural consistency between the learned representations and the actual data. The proposed methodology is extensively validated on eight GM datasets with incomplete MRI sequences and six clinical datasets from other diseases with incomplete imaging modalities. Comprehensive comparisons with state-of-the-art methods demonstrate the competitiveness of our approach for GM diagnosis with incomplete MRI sequences, as well as its potential to generalize to various diseases with missing imaging modalities.

Monocular pose estimation of articulated open surgery tools - in the wild
Robert Spektor, Tom Friedman, Itay Or, Gil Bolotin, Shlomi Laufer
Medical Image Analysis, vol. 103, Article 103618. Published 2025-05-03. DOI: 10.1016/j.media.2025.103618

This work presents a framework for monocular 6D pose estimation of surgical instruments in open surgery, addressing challenges such as object articulations, specularity, occlusions, and synthetic-to-real domain adaptation. The proposed approach consists of three main components: (1) a synthetic data generation pipeline that incorporates 3D scanning of surgical tools with articulation rigging and physically-based rendering; (2) a tailored pose estimation framework combining tool detection with pose and articulation estimation; and (3) a training strategy on synthetic and real unannotated video data, employing domain adaptation with automatically generated pseudo-labels. Evaluations conducted on real open surgery data demonstrate the good performance and real-world applicability of the proposed framework, highlighting its potential for integration into medical augmented reality and robotic systems. The approach eliminates the need for extensive manual annotation of real surgical data.

XCAT 3.0: A comprehensive library of personalized digital twins derived from CT scans
Lavsen Dahal, Mobina Ghojoghnejad, Liesbeth Vancoillie, Dhrubajyoti Ghosh, Yubraj Bhandari, David Kim, Fong Chi Ho, Fakrul Islam Tushar, Sheng Luo, Kyle J. Lafata, Ehsan Abadi, Ehsan Samei, Joseph Y. Lo, W. Paul Segars
Medical Image Analysis, vol. 103, Article 103636. Published 2025-05-03. DOI: 10.1016/j.media.2025.103636

Virtual Imaging Trials (VITs) offer a cost-effective and scalable approach for evaluating medical imaging technologies. Computational phantoms, which mimic real patient anatomy and physiology, play a central role in VITs. However, current libraries of computational phantoms face limitations, particularly in sample size and heterogeneity; insufficient representation of the population hampers accurate assessment of imaging technologies across different patient groups. Traditionally, the more realistic computational phantoms were created by manual segmentation, a laborious and time-consuming task that impedes the expansion of phantom libraries. This study presents a framework for creating realistic computational phantoms using a suite of automatic segmentation models, with three forms of automated quality control performed on the segmented organ masks. The result is the release of over 2500 new XCAT 3.0 computational phantoms. This new generation embodies 140 structures and represents a comprehensive approach to detailed anatomical modeling. The phantoms are provided in both voxelized and surface-mesh formats. The framework is combined with an in-house CT scanner simulator to produce realistic CT images, and it has the potential to advance virtual imaging trials, facilitating comprehensive and reliable evaluations of medical imaging technologies. Phantoms may be requested at https://cvit.duke.edu/resources/. Code, model weights, and sample CT images are available at https://xcat-3.github.io/.

Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation
Julio Silva-Rodríguez, Jose Dolz, Ismail Ben Ayed
Medical Image Analysis, vol. 103, Article 103596. Published 2025-05-02. DOI: 10.1016/j.media.2025.103596

The recent popularity of foundation models and the pre-train-and-adapt paradigm, in which a large-scale model is transferred to downstream tasks, is gaining attention for volumetric medical image segmentation. However, current transfer learning strategies devoted to full fine-tuning may require significant resources and yield sub-optimal results when labeled data for the target task is scarce. This makes their applicability in real clinical settings challenging, since these institutions are usually constrained in the data and computational resources needed to develop proprietary solutions. To address this challenge, we formalize Few-Shot Efficient Fine-Tuning (FSEFT), a novel and realistic scenario for adapting medical image segmentation foundation models. This setting considers the key role of both data- and parameter-efficiency during adaptation. Building on a foundation model pre-trained on open-access CT organ segmentation sources, we propose leveraging Parameter-Efficient Fine-Tuning and black-box Adapters to address these challenges. Furthermore, novel efficient adaptation methodologies are introduced, including Spatial black-box Adapters, which are more appropriate for dense prediction tasks, and constrained transductive inference, which leverages task-specific prior knowledge. Our comprehensive transfer learning experiments confirm the suitability of foundation models in medical image segmentation and unveil the limitations of popular fine-tuning strategies in few-shot scenarios. The project code is available at: https://github.com/jusiro/fewshot-finetuning.
