Ruize Cui, Lanqing Liu, Jing Zou, Xiaowei Hu, Jialun Pei, Jing Qin
Taming large vision model for medical image segmentation via Dual Visual Prompt Tuning
Computerized Medical Imaging and Graphics, volume 124, Article 102608. Published 2025-07-19. Journal article.
DOI: 10.1016/j.compmedimag.2025.102608
https://www.sciencedirect.com/science/article/pii/S089561112500117X
Citations: 0
Abstract
This paper presents Dual Visual Prompt Tuning (DVPT), an innovative strategy to enhance the performance of the Segment Anything Model (SAM) for medical image segmentation. While SAM demonstrates robust generalization in natural image segmentation, its effectiveness in medical tasks is hindered by the distinct characteristics of medical targets, the presence of noise and artifacts, and insufficient task-specific data for fine-tuning. Moreover, the manual-prompting paradigm applied in SAM makes it laborious to adapt to the medical domain. To address these challenges, DVPT employs a fully automatic prompting paradigm and assembles both image-specific local and global guidance into SAM through two components: the Local Feature Prompt Tuning (LFPT) module, which enhances local information capture of detailed anatomical structures, and the Global Guiding Prompt (GGP) encoder, which mitigates noise interference and strengthens the identification of ambiguous boundaries within medical images. By integrating both local and global prompts within the mask decoder, the proposed DVPT yields superior segmentation accuracy. Experimental results across three medical image segmentation tasks consistently demonstrate that our method outperforms current state-of-the-art approaches. Our method significantly contributes to accurate and impactful computer-assisted diagnostics, promoting advancements in healthcare solutions. Our code is available at https://github.com/cuiruize/DVPT.
About the journal:
The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.