Title: CLIP in medical imaging: A survey
Authors: Zihao Zhao, Yuxiao Liu, Han Wu, Mei Wang, Yonghao Li, Sheng Wang, Lin Teng, Disheng Liu, Zhiming Cui, Qian Wang, Dinggang Shen
Medical Image Analysis, vol. 102, Article 103551. DOI: 10.1016/j.media.2025.103551. Published 2025-03-22.
Abstract: Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training paradigm, successfully introduces text supervision to vision models. It has shown promising results across various tasks due to its generalizability and interpretability. The use of CLIP has recently gained increasing interest in the medical imaging domain, serving as a pre-training paradigm for image–text alignment or as a critical component in diverse clinical tasks. With the aim of facilitating a deeper understanding of this promising direction, this survey offers an in-depth exploration of CLIP within the domain of medical imaging, covering both refined CLIP pre-training and CLIP-driven applications. In this paper, we (1) start with a brief introduction to the fundamentals of the CLIP methodology; (2) investigate the adaptation of CLIP pre-training to the medical imaging domain, focusing on how to optimize CLIP given the characteristics of medical images and reports; (3) explore the practical use of CLIP pre-trained models in various tasks, including classification, dense prediction, and cross-modal tasks; and (4) discuss existing limitations of CLIP in the context of medical imaging and propose forward-looking directions to address the demands of the medical imaging domain. Studies of both technical and practical value are investigated. We expect this survey will provide researchers with a holistic understanding of the CLIP paradigm and its potential implications. The project page of this survey can also be found on GitHub.

{"title":"Enhancing source-free domain adaptation in Medical Image Segmentation via regulated model self-training","authors":"Tianwei Zhang , Kang Li , Shi Gu , Pheng-Ann Heng","doi":"10.1016/j.media.2025.103543","DOIUrl":"10.1016/j.media.2025.103543","url":null,"abstract":"<div><div>Source-free domain adaptation (SFDA) has drawn increasing attention lately in the medical field. It aims to adapt a model well trained on source domain to target domains without accessing source domain data nor requiring target domain labels, to enable privacy-protecting and annotation-efficient domain adaptation. Most SFDA approaches initialize the target model with source model weights, and guide model self-training with the pseudo-labels generated from the source model. However, when source and target domains have huge discrepancies (<em>e.g.</em>, different modalities), the obtained pseudo-labels would be of poor quality. Different from prior works that overcome it by refining pseudo-labels to better quality, in this work, we try to explore it from the perspective of knowledge transfer. We recycle the beneficial domain-invariant prior knowledge in the source model, and refresh its domain-specific knowledge from source-specific to target-specific, to help the model satisfyingly tackle target domains even when facing severe domain shifts. To achieve it, we proposed a regulated model self-training framework. For high-transferable domain-invariant parameters, we constrain their update magnitude from large changes, to secure the domain-shared priors from going stray and let it continuously facilitate target domain adaptation. For the low-transferable domain-specific parameters, we actively update them to let the domain-specific embedding become target-specific. Regulating them together, the model would develop better capability for target data even under severe domain shifts. Importantly, the proposed approach could seamlessly collaborate with existing pseudo-label refinement approaches to bring more performance gains. We have extensively validated our framework under significant domain shifts in 3D cross-modality cardiac segmentation, and under minor domain shifts in 2D cross-vendor fundus segmentation, respectively. Our approach consistently outperformed the competing methods and achieved superior performance.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"102 ","pages":"Article 103543"},"PeriodicalIF":10.7,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143714209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Revisiting medical image retrieval via knowledge consolidation
Authors: Yang Nan, Huichi Zhou, Xiaodan Xing, Giorgos Papanastasiou, Lei Zhu, Zhifan Gao, Alejandro F. Frangi, Guang Yang
Medical Image Analysis, vol. 102, Article 103553. DOI: 10.1016/j.media.2025.103553. Published 2025-03-22. Open access.
Abstract: As artificial intelligence and digital medicine increasingly permeate healthcare systems, robust governance frameworks are essential to ensure ethical, secure, and effective implementation. In this context, medical image retrieval becomes a critical component of clinical data management, playing a vital role in decision-making and safeguarding patient information. Existing methods usually learn hash functions using bottleneck features, which fail to produce representative hash codes from blended embeddings. Although contrastive hashing has shown superior performance, current approaches often treat image retrieval as a classification task, using category labels to create positive/negative pairs. Moreover, many methods fail to address the out-of-distribution (OOD) issue when models encounter external OOD queries or adversarial attacks. In this work, we propose a novel method to consolidate knowledge of hierarchical features and optimization functions. We formulate the knowledge consolidation by introducing Depth-aware Representation Fusion (DaRF) and Structure-aware Contrastive Hashing (SCH). DaRF adaptively integrates shallow and deep representations into blended features, and SCH incorporates image fingerprints to enhance the adaptability of positive/negative pairings. These blended features further facilitate OOD detection and content-based recommendation, contributing to a secure AI-driven healthcare environment. Moreover, we present a content-guided ranking to improve the robustness and reproducibility of retrieval results. Our comprehensive assessments demonstrate that the proposed method could effectively recognize OOD samples and significantly outperform existing approaches in medical image retrieval (p < 0.05). In particular, our method achieves a 5.6–38.9% improvement in mean Average Precision on the anatomical radiology dataset.

Title: MMR-Mamba: Multi-modal MRI reconstruction with Mamba and spatial-frequency information fusion
Authors: Jing Zou, Lanqing Liu, Qi Chen, Shujun Wang, Zhanli Hu, Xiaohan Xing, Jing Qin
Medical Image Analysis, vol. 102, Article 103549. DOI: 10.1016/j.media.2025.103549. Published 2025-03-21.
Abstract: Multi-modal MRI offers valuable complementary information for diagnosis and treatment; however, its clinical utility is limited by prolonged scanning time. To accelerate the acquisition process, a practical approach is to reconstruct images of the target modality, which requires longer scanning time, from under-sampled k-space data, using the fully-sampled reference modality with shorter scanning time as guidance. The primary challenge of this task lies in comprehensively and efficiently integrating complementary information from different modalities to achieve high-quality reconstruction. Existing methods struggle with this challenge: (1) convolution-based models fail to capture long-range dependencies; (2) transformer-based models, while excelling in global feature modeling, suffer from quadratic computational complexity. To address this dilemma, we propose MMR-Mamba, a novel framework that thoroughly and efficiently integrates multi-modal features for MRI reconstruction, leveraging Mamba's capability to capture long-range dependencies with linear computational complexity while exploiting global properties of the Fourier domain. Specifically, we first design a Target modality-guided Cross Mamba (TCM) module in the spatial domain, which maximally restores the target modality information by selectively incorporating relevant information from the reference modality. Then, we introduce a Selective Frequency Fusion (SFF) module to efficiently integrate global information in the Fourier domain and recover high-frequency signals for the reconstruction of structural details. Furthermore, we devise an Adaptive Spatial-Frequency Fusion (ASFF) module, which mutually enhances the spatial and frequency domains by supplementing less informative channels from one domain with corresponding channels from the other. Extensive experiments on the BraTS and fastMRI knee datasets demonstrate the superiority of our MMR-Mamba over state-of-the-art reconstruction methods. The code is publicly available at https://github.com/zoujing925/MMR-Mamba.

Title: Automated planning of mandible reconstruction with fibula free flap based on shape completion and morphometric descriptors
Authors: Yan Guo, Chenyao Li, Rong Yang, Puxun Tu, Bolun Zeng, Jiannan Liu, Tong Ji, Chenping Zhang, Xiaojun Chen
Medical Image Analysis, vol. 102, Article 103544. DOI: 10.1016/j.media.2025.103544. Published 2025-03-21.
Abstract: Vascularized fibula free flap (FFF) grafts are frequently used to reconstruct mandibular defects. However, current planning methods for osteotomy, splicing, and fibula placement present challenges in achieving satisfactory facial aesthetics and restoring the original morphology of the mandible. In this study, we propose a novel two-step framework for automated preoperative planning in FFF mandibular reconstruction, based on mandibular shape completion and morphometric descriptors. First, we utilize a 3D generative model to estimate the entire mandibular geometry by incorporating shape priors and accounting for partially defective mandibles; accurately predicting the premorbid morphology of the mandible is crucial for determining the surgical plan. Second, we introduce new two-dimensional morphometric descriptors to assess the quantitative difference between a planning scheme and the full morphology of the mandible. We design intuitive and valid variables to describe the planning scheme and construct an objective function to measure this difference; by optimizing the objective, we obtain the best shape-matched 3D planning solution. In a retrospective study involving 65 real tumor patients, our method exhibited favorable results in both qualitative and quantitative analyses when compared to the plans of experienced clinicians using existing methods, demonstrating that our method offers an automated preoperative planning technique that eliminates subjectivity and achieves user-independent results. Furthermore, we present the potential of our automated planning process in a clinical case, highlighting its applicability in clinical settings.

Title: An extragradient and noise-tuning adaptive iterative network for diffusion MRI-based microstructural estimation
Authors: Tianshu Zheng, Chuyang Ye, Zhaopeng Cui, Hui Zhang, Daniel C. Alexander, Dan Wu
Medical Image Analysis, vol. 102, Article 103535. DOI: 10.1016/j.media.2025.103535. Published 2025-03-20.
Abstract: Diffusion MRI (dMRI) is a powerful technique for investigating tissue microstructure properties. However, advanced dMRI models are typically complex and nonlinear, requiring a large number of acquisitions in the q-space. Deep learning techniques, specifically optimization-based networks, have been proposed to improve model fitting with limited q-space data. Previous optimization procedures relied on the empirical selection of iteration block numbers, and the network structures were based on the iterative hard thresholding (IHT) algorithm, which may suffer from instability during sparse reconstruction. In this study, we introduce an extragradient and noise-tuning adaptive iterative network, a generic network for estimating dMRI model parameters. We propose an adaptive mechanism that flexibly adjusts the sparse representation process depending on specific dMRI models, datasets, and downsampling strategies, avoiding manual selection and accelerating inference. In addition, we propose a noise-tuning module to assist the network in escaping from local minima and saddle points. The network also includes an additional projection of the extragradient to ensure its convergence. We evaluated the performance of the proposed network on the neurite orientation dispersion and density imaging (NODDI) model and the diffusion basis spectrum imaging (DBSI) model, using two 3T Human Connectome Project (HCP) datasets and a 7T HCP dataset with six different downsampling strategies. The proposed framework demonstrated superior accuracy and generalizability compared to other state-of-the-art microstructural estimation algorithms.

{"title":"5D image reconstruction exploiting space-motion-echo sparsity for accelerated free-breathing quantitative liver MRI","authors":"MungSoo Kang , Ricardo Otazo , Gerald Behr , Youngwook Kee","doi":"10.1016/j.media.2025.103532","DOIUrl":"10.1016/j.media.2025.103532","url":null,"abstract":"<div><div>Recent advances in 3D non-Cartesian multi-echo gradient-echo (mGRE) imaging and compressed sensing (CS)-based 4D (3D image space + 1D respiratory motion) motion-resolved image reconstruction, which applies temporal total variation to the respiratory motion dimension, have enabled free-breathing liver tissue MR parameter mapping. This technology now allows for robust reconstruction of high-resolution proton density fat fraction (PDFF), R<span><math><msubsup><mrow></mrow><mrow><mn>2</mn></mrow><mrow><mo>∗</mo></mrow></msubsup></math></span>, and quantitative susceptibility mapping (QSM), previously unattainable with conventional Cartesian mGRE imaging. However, long scan times remain a persistent challenge in free-breathing 3D non-Cartesian mGRE imaging. Recognizing that the underlying dimension of the imaging data is essentially 5D (4D + 1D echo signal evolution), we propose a CS-based 5D motion-resolved mGRE image reconstruction method to further accelerate the acquisition. Our approach integrates discrete wavelet transforms along the echo and spatial dimensions into a CS-based reconstruction model and devises a solution algorithm capable of handling such a 5D complex-valued array. Through phantom and in vivo human subject studies, we evaluated the effectiveness of leveraging unexplored correlations by comparing the proposed 5D reconstruction with the 4D reconstruction (i.e., motion-resolved reconstruction with temporal total variation) across a wide range of acceleration factors. The 5D reconstruction produced more reliable and consistent measurements of PDFF, R<span><math><msubsup><mrow></mrow><mrow><mn>2</mn></mrow><mrow><mo>∗</mo></mrow></msubsup></math></span>, and QSM compared to the 4D reconstruction. In conclusion, the proposed 5D motion-resolved image reconstruction demonstrates the feasibility of achieving accelerated, reliable, and free-breathing liver mGRE imaging for the measurement of PDFF, R<span><math><msubsup><mrow></mrow><mrow><mn>2</mn></mrow><mrow><mo>∗</mo></mrow></msubsup></math></span>, and QSM.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"102 ","pages":"Article 103532"},"PeriodicalIF":10.7,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143675551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Medical SAM adapter: Adapting segment anything model for medical image segmentation
Authors: Junde Wu, Ziyue Wang, Mingxuan Hong, Wei Ji, Huazhu Fu, Yanwu Xu, Min Xu, Yueming Jin
Medical Image Analysis, vol. 102, Article 103547. DOI: 10.1016/j.media.2025.103547. Published 2025-03-19. Open access.
Abstract: The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation due to its impressive capabilities in various segmentation tasks and its prompt-based interface. However, recent studies and individual experiments have shown that SAM underperforms in medical image segmentation due to the lack of medical-specific knowledge. This raises the question of how to enhance SAM's segmentation capability for medical images. We propose the Medical SAM Adapter (Med-SA), which is one of the first methods to integrate SAM into medical image segmentation. Med-SA uses a light yet effective adaptation technique instead of fine-tuning the SAM model, incorporating domain-specific medical knowledge into the segmentation model. We also propose Space-Depth Transpose (SD-Trans) to adapt 2D SAM to 3D medical images and Hyper-Prompting Adapter (HyP-Adpt) to achieve prompt-conditioned adaptation. Comprehensive evaluation experiments on 17 medical image segmentation tasks across various modalities demonstrate the superior performance of Med-SA while updating only 2% of the SAM parameters (13M). Our code is released at https://github.com/KidsWithTokens/Medical-SAM-Adapter.

Title: RED: Residual estimation diffusion for low-dose PET sinogram reconstruction
Authors: Xingyu Ai, Bin Huang, Fang Chen, Liu Shi, Binxuan Li, Shaoyu Wang, Qiegen Liu
Medical Image Analysis, vol. 102, Article 103558. DOI: 10.1016/j.media.2025.103558. Published 2025-03-18.
Abstract: Recent advances in diffusion models have demonstrated exceptional performance in generative tasks across various fields. In positron emission tomography (PET), reducing the tracer dose leads to information loss in sinograms, and using diffusion models to reconstruct the missing information can improve imaging quality. Traditional diffusion models effectively use Gaussian noise for image reconstruction; however, in low-dose PET reconstruction, Gaussian noise can worsen the already sparse data by introducing artifacts and inconsistencies. To address this issue, we propose a diffusion model named residual estimation diffusion (RED). From the perspective of the diffusion mechanism, RED uses the residual between sinograms to replace Gaussian noise in the diffusion process, setting the low-dose and full-dose sinograms as the starting point and endpoint of reconstruction, respectively. This mechanism helps preserve the original information in the low-dose sinogram, thereby enhancing reconstruction reliability. From the perspective of data consistency, RED introduces a drift-correction strategy to reduce accumulated prediction errors during the reverse process; calibrating the intermediate results of the reverse iterations helps maintain data consistency and enhances the stability of the reconstruction process. In the experiments, RED achieved the best performance across all metrics. Specifically, the PSNR metric showed improvements of 2.75, 5.45, and 8.08 dB at dose reduction factors (DRF) of 4, 20, and 100, respectively, compared to traditional methods. The code is available at: https://github.com/yqx7150/RED.

Title: Segment Any Tissue: One-shot reference guided training-free automatic point prompting for medical image segmentation
Authors: Xueyu Liu, Guangze Shi, Rui Wang, Yexin Lai, Jianan Zhang, Weixia Han, Min Lei, Ming Li, Xiaoshuang Zhou, Yongfei Wu, Chen Wang, Wen Zheng
Medical Image Analysis, vol. 102, Article 103550. DOI: 10.1016/j.media.2025.103550. Published 2025-03-18.
Abstract: Medical image segmentation frequently encounters high annotation costs and challenges in task adaptation. While visual foundation models have shown promise in natural image segmentation, automatically generating high-quality prompts for class-agnostic segmentation of medical images remains a significant practical challenge. To address these challenges, we present Segment Any Tissue (SAT), an innovative, training-free framework designed to automatically prompt a class-agnostic visual foundation model for the segmentation of medical images with only a one-shot reference. SAT leverages the robust feature-matching capabilities of a pretrained foundation model to construct distance metrics in the feature space. By integrating these with distance metrics in the physical space, SAT establishes a dual-space cyclic prompt engineering approach for automatic prompt generation, optimization, and evaluation. Subsequently, SAT uses a class-agnostic foundation segmentation model with the generated prompt scheme to obtain segmentation results. Additionally, we extend the one-shot framework by incorporating multiple reference images to construct an ensemble SAT, further enhancing segmentation performance. SAT has been validated on six public and private medical segmentation tasks, capturing both macroscopic and microscopic perspectives across multiple dimensions. In the ablation experiments, automatic prompt selection enabled SAT to effectively handle tissues of various sizes, while also validating the effectiveness of each component. The comparative experiments show that SAT is comparable to, or even exceeds, some fully supervised methods, and it demonstrates superior performance compared to existing one-shot methods. In summary, SAT requires only a single pixel-level annotated reference image to perform tissue segmentation across various medical images in a training-free manner. This not only significantly reduces the annotation costs of applying foundation models to the medical field but also enhances task transferability, providing a foundation for the clinical application of intelligent medicine. Our source code is available at https://github.com/SnowRain510/Segment-Any-Tissue.
