Adrien Meyer, Aditya Murali, Farahdiba Zarin, Didier Mutter, Nicolas Padoy
{"title":"Ultrasam: a foundation model for ultrasound using large open-access segmentation datasets.","authors":"Adrien Meyer, Aditya Murali, Farahdiba Zarin, Didier Mutter, Nicolas Padoy","doi":"10.1007/s11548-025-03517-8","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Automated ultrasound (US) image analysis remains a longstanding challenge due to anatomical complexity and the scarcity of annotated data. Although large-scale pretraining has improved data efficiency in many visual domains, its impact in US is limited by a pronounced domain shift from other imaging modalities and high variability across clinical applications, such as chest, ovarian, and endoscopic imaging. To address this, we propose UltraSam, a SAM-style model trained on a heterogeneous collection of publicly available segmentation datasets, originally developed in isolation. UltraSam is trained under the prompt-conditioned segmentation paradigm, which eliminates the need for unified labels and enables generalization to a broad range of downstream tasks.</p><p><strong>Methods: </strong>We compile US-43d, a large-scale collection of 43 open-access US datasets comprising over 282,000 images with segmentation masks covering 58 anatomical structures. We explore adaptation and fine-tuning strategies for SAM and systematically evaluate transferability across downstream tasks, comparing against state-of-the-art pretraining methods. We further propose prompted classification, a new use case where object-specific prompts and image features are jointly decoded to improve classification performance.</p><p><strong>Results: </strong>In experiments on three diverse public US datasets, UltraSam outperforms existing SAM variants on prompt-based segmentation and surpasses self-supervised US foundation models on downstream (prompted) classification and instance segmentation tasks.</p><p><strong>Conclusion: </strong>UltraSam demonstrates that SAM-style training on diverse, sparsely annotated US data enables effective generalization across tasks. By unlocking the value of fragmented public datasets, our approach lays the foundation for scalable, real-world US representation learning. We release our code and pretrained models at https://github.com/CAMMA-public/UltraSam and invite the community to further this effort by continuing to contribute high-quality datasets.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Assisted Radiology and Surgery","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1007/s11548-025-03517-8","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose: Automated ultrasound (US) image analysis remains a longstanding challenge due to anatomical complexity and the scarcity of annotated data. Although large-scale pretraining has improved data efficiency in many visual domains, its impact in US is limited by a pronounced domain shift from other imaging modalities and high variability across clinical applications, such as chest, ovarian, and endoscopic imaging. To address this, we propose UltraSam, a SAM-style model trained on a heterogeneous collection of publicly available segmentation datasets, originally developed in isolation. UltraSam is trained under the prompt-conditioned segmentation paradigm, which eliminates the need for unified labels and enables generalization to a broad range of downstream tasks.
Methods: We compile US-43d, a large-scale collection of 43 open-access US datasets comprising over 282,000 images with segmentation masks covering 58 anatomical structures. We explore adaptation and fine-tuning strategies for SAM and systematically evaluate transferability across downstream tasks, comparing against state-of-the-art pretraining methods. We further propose prompted classification, a new use case where object-specific prompts and image features are jointly decoded to improve classification performance.
Results: In experiments on three diverse public US datasets, UltraSam outperforms existing SAM variants on prompt-based segmentation and surpasses self-supervised US foundation models on downstream (prompted) classification and instance segmentation tasks.
Conclusion: UltraSam demonstrates that SAM-style training on diverse, sparsely annotated US data enables effective generalization across tasks. By unlocking the value of fragmented public datasets, our approach lays the foundation for scalable, real-world US representation learning. We release our code and pretrained models at https://github.com/CAMMA-public/UltraSam and invite the community to further this effort by continuing to contribute high-quality datasets.
期刊介绍:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.