{"title":"Brain Latent Progression: Individual-based spatiotemporal disease progression on 3D Brain MRIs via latent diffusion","authors":"Lemuel Puglisi , Daniel C. Alexander , Alzheimer’s Disease Neuroimaging Initiative , Australian Imaging Biomarkers and Lifestyle flagship study of aging , Daniele Ravì","doi":"10.1016/j.media.2025.103734","DOIUrl":"10.1016/j.media.2025.103734","url":null,"abstract":"<div><div>The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients. However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensuring spatiotemporal consistency, efficiently utilizing longitudinal data, and managing the substantial memory demands of 3D scans. To address these challenges, we propose Brain Latent Progression (BrLP), a novel spatiotemporal model designed to predict individual-level disease progression in 3D brain MRIs. The key contributions in BrLP are fourfold: (i) it operates in a small latent space, mitigating the computational challenges posed by high-dimensional imaging data; (ii) it explicitly integrates subject metadata to enhance the individualization of predictions; (iii) it incorporates prior knowledge of disease dynamics through an auxiliary model, facilitating the integration of longitudinal data; and (iv) it introduces the Latent Average Stabilization (LAS) algorithm, which (a) enforces spatiotemporal consistency in the predicted progression at inference time and (b) allows us to derive a measure of the uncertainty for the prediction at the global and voxel level. We train and evaluate BrLP on 11,730 T1-weighted (T1w) brain MRIs from 2,805 subjects and validate its generalizability on an external test set comprising 2,257 MRIs from 962 subjects. Our experiments compare BrLP-generated MRI scans with real follow-up MRIs, demonstrating state-of-the-art accuracy compared to existing methods. The code is publicly available at: <span><span>https://github.com/LemuelPuglisi/BrLP</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103734"},"PeriodicalIF":11.8,"publicationDate":"2025-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A generalizable diffusion framework for 3D low-dose and few-view cardiac SPECT imaging","authors":"Huidong Xie , Weijie Gan , Wei Ji , Xiongchao Chen , Alaa Alashi , Stephanie L. Thorn , Bo Zhou , Qiong Liu , Menghua Xia , Xueqi Guo , Yi-Hwa Liu , Hongyu An , Ulugbek S. Kamilov , Ge Wang , Albert J. Sinusas , Chi Liu","doi":"10.1016/j.media.2025.103729","DOIUrl":"10.1016/j.media.2025.103729","url":null,"abstract":"<div><div>Myocardial perfusion imaging using SPECT is widely utilized to diagnose coronary artery diseases, but image quality can be negatively affected in low-dose and few-view acquisition settings. Although various deep learning methods have been introduced to improve image quality from low-dose or few-view SPECT data, previous approaches often fail to generalize across different acquisition settings, limiting realistic applicability. This work introduced DiffSPECT-3D, a diffusion framework for 3D cardiac SPECT imaging that effectively adapts to different acquisition settings without requiring further network re-training or fine-tuning. Using both image and projection data, a consistency strategy is proposed to ensure that diffusion sampling at each step aligns with the low-dose/few-view projection measurements, the image data, and the scanner geometry, thus enabling generalization to different low-dose/few-view settings. Incorporating anatomical spatial information from CT and total variation constraint, we proposed a 2.5D conditional strategy to allow DiffSPECT-3D to observe 3D contextual information from the entire image volume, addressing the 3D memory/computational issues in diffusion model. We extensively evaluated the proposed method on 1,325 clinical <span><math><msup><mrow></mrow><mrow><mtext>99m</mtext></mrow></msup></math></span>Tc tetrofosmin stress/rest studies from 795 patients. Each study was reconstructed into 5 different low-count levels and 5 different projection few-view levels for model evaluations, ranging from 1% to 50% and from 1 view to 9 view, respectively. Validated against cardiac catheterization results and diagnostic review from nuclear cardiologists, the presented results show the potential to achieve low-dose and few-view SPECT imaging without compromising clinical performance. Additionally, DiffSPECT-3D could be directly applied to full-dose SPECT images to further improve image quality, especially in a low-dose stress-first cardiac SPECT imaging protocol.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103729"},"PeriodicalIF":11.8,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144768766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray","authors":"Mingquan Lin , Gregory Holste , Song Wang , Yiliang Zhou , Yishu Wei , Imon Banerjee , Pengyi Chen , Tianjie Dai , Yuexi Du , Nicha C. Dvornek , Yuyan Ge , Zuwei Guo , Shouhei Hanaoka , Dongkyun Kim , Pablo Messina , Yang Lu , Denis Parra , Donghyun Son , Álvaro Soto , Aisha Urooj , Yifan Peng","doi":"10.1016/j.media.2025.103739","DOIUrl":"10.1016/j.media.2025.103739","url":null,"abstract":"<div><div>The CXR-LT series is a community-driven initiative designed to enhance lung disease classification using chest X-rays (CXR). It tackles challenges in open long-tailed lung disease classification and enhances the measurability of state-of-the-art techniques. The first event, CXR-LT 2023, aimed to achieve these goals by providing high-quality benchmark CXR data for model development and conducting comprehensive evaluations to identify ongoing issues impacting lung disease classification performance. Building on the success of CXR-LT 2023, the <strong>CXR-LT 2024</strong> expands the dataset to 377,110 chest X-rays (CXRs) and 45 disease labels, including 19 new rare disease findings. It also introduces a new focus on zero-shot learning to address limitations identified in the previous event. Specifically, CXR-LT 2024 features three tasks: (i) long-tailed classification on a large, noisy test set, (ii) long-tailed classification on a manually annotated “gold standard” subset, and (iii) zero-shot generalization to five previously unseen disease findings. This paper provides an overview of CXR-LT 2024, detailing the data curation process and consolidating state-of-the-art solutions, including the use of multimodal models for rare disease detection, advanced generative approaches to handle noisy labels, and zero-shot learning strategies for unseen diseases. Additionally, the expanded dataset enhances disease coverage to better represent real-world clinical settings, offering a valuable resource for future research. By synthesizing the insights and innovations of participating teams, we aim to advance the development of clinically realistic and generalizable diagnostic models for chest radiography.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103739"},"PeriodicalIF":11.8,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144809816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Faceted Consistency learning with active cross-labeling for barely-supervised 3D medical image segmentation","authors":"Xinyao Wu , Zhe Xu , Raymond Kai-yu Tong","doi":"10.1016/j.media.2025.103744","DOIUrl":"10.1016/j.media.2025.103744","url":null,"abstract":"<div><div>Deep learning-driven 3D medical image segmentation generally necessitates dense voxel-wise annotations, which are expensive and labor-intensive to acquire. Cross-annotation, which labels only a few orthogonal slices per scan, has recently emerged as a cost-effective alternative that better preserves the shape and precise boundaries of the 3D object than traditional weak labeling methods such as bounding boxes and scribbles. However, learning from such sparse labels, referred to as barely-supervised learning (BSL), remains challenging due to less fine-grained object perception, less compact class features and inferior generalizability. To tackle these challenges and foster collaboration between model training and human expertise, we propose a Multi-Faceted ConSistency learning (MF-ConS) framework with a Diversity and Uncertainty Sampling-based Active Learning (DUS-AL) strategy, specifically designed for the active BSL scenario. This framework combines a cross-annotation BSL strategy, where only three orthogonal slices are labeled per scan, with an AL paradigm guided by DUS to direct human-in-the-loop annotation toward the most informative volumes under a fixed budget. Built upon a teacher–student architecture, MF-ConS integrates three complementary consistency regularization modules: (i) neighbor-informed object prediction consistency for advancing fine-grained object perception by encouraging the student model to infer complete segmentation from masked inputs; (ii) prototype-driven consistency, which enhances intra-class compactness and discriminativeness by aligning latent feature and decision spaces using fused prototypes; and (iii) stability constraint that promotes model robustness against input perturbations. Extensive experiments on three benchmark datasets demonstrate that MF-ConS (DUS-AL) consistently outperforms state-of-the-art methods under extremely limited annotation.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103744"},"PeriodicalIF":11.8,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attend-and-Refine: Interactive keypoint estimation and quantitative cervical vertebrae analysis for bone age assessment","authors":"Jinhee Kim , Taesung Kim , Taewoo Kim , Dong-Wook Kim , Byungduk Ahn , Yoon-Ji Kim , In-Seok Song , Jaegul Choo","doi":"10.1016/j.media.2025.103715","DOIUrl":"10.1016/j.media.2025.103715","url":null,"abstract":"<div><div>In pediatric orthodontics, accurate estimation of growth potential is essential for developing effective treatment strategies. Our research aims to predict this potential by identifying the growth peak and analyzing cervical vertebra morphology solely through lateral cephalometric radiographs. We accomplish this by comprehensively analyzing cervical vertebral maturation (CVM) features from these radiographs. This methodology provides clinicians with a reliable and efficient tool to determine the optimal timings for orthodontic interventions, ultimately enhancing patient outcomes. A crucial aspect of this approach is the meticulous annotation of keypoints on the cervical vertebrae, a task often challenged by its labor-intensive nature. To mitigate this, we introduce Attend-and-Refine Network (ARNet), a user-interactive, deep learning-based model designed to streamline the annotation process. ARNet features Interaction-guided recalibration network, which adaptively recalibrates image features in response to user feedback, coupled with a morphology-aware loss function that preserves the structural consistency of keypoints. This novel approach substantially reduces manual effort in keypoint identification, thereby enhancing the efficiency and accuracy of the process. Extensively validated across various datasets, ARNet demonstrates remarkable performance and exhibits wide-ranging applicability in medical imaging. In conclusion, our research offers an effective AI-assisted diagnostic tool for assessing growth potential in pediatric orthodontics, marking a significant advancement in the field.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103715"},"PeriodicalIF":11.8,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects","authors":"Pratibha Kumari , Joohi Chauhan , Afshin Bozorgpour , Boqiang Huang , Reza Azad , Dorit Merhof","doi":"10.1016/j.media.2025.103730","DOIUrl":"10.1016/j.media.2025.103730","url":null,"abstract":"<div><div>Medical image analysis has witnessed remarkable advancements, even surpassing human-level performance in recent years, driven by the rapid development of advanced deep-learning algorithms. However, when the inference dataset slightly differs from what the model has seen during one-time training, the model performance is greatly compromised. The situation requires restarting the training process using both the old and the new data, which is computationally costly, does not align with the human learning process, and imposes storage constraints and privacy concerns. Alternatively, continual learning has emerged as a crucial approach for developing unified and sustainable deep models to deal with new classes, tasks, and the drifting nature of data in non-stationary environments for various application areas. Continual learning techniques enable models to adapt and accumulate knowledge over time, which is essential for maintaining performance on evolving datasets and novel tasks. Owing to its popularity and promising performance, it is an active and emerging research topic in the medical field and hence demands a survey and taxonomy to clarify the current research landscape of continual learning in medical image analysis. This systematic review paper provides a comprehensive overview of the state-of-the-art in continual learning techniques applied to medical image analysis. We present an extensive survey of existing research, covering topics including catastrophic forgetting, data drifts, stability, and plasticity requirements. Further, an in-depth discussion of key components of a continual learning framework, such as continual learning scenarios, techniques, evaluation schemes, and metrics, is provided. Continual learning techniques encompass various categories, including rehearsal, regularization, architectural, and hybrid strategies. We assess the popularity and applicability of continual learning categories in various medical sub-fields like radiology and histopathology. Our exploration considers unique challenges in the medical domain, including costly data annotation, temporal drift, and the crucial need for benchmarking datasets to ensure consistent model evaluation. The paper also addresses current challenges and looks ahead to potential future research directions.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103730"},"PeriodicalIF":11.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144768767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pixel-wise recognition for holistic surgical scene understanding","authors":"Nicolás Ayobi , Santiago Rodríguez , Alejandra Pérez , Isabela Hernández , Nicolás Aparicio , Eugénie Dessevres , Sebastián Peña , Jessica Santander , Juan Ignacio Caicedo , Nicolás Fernández , Pablo Arbeláez","doi":"10.1016/j.media.2025.103726","DOIUrl":"10.1016/j.media.2025.103726","url":null,"abstract":"<div><div>This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset, a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity. Our approach encompasses long-term tasks, such as surgical phase and step recognition, and short-term tasks, including surgical instrument segmentation and atomic visual actions detection. To exploit our proposed benchmark, we introduce the Transformers for Actions, Phases, Steps, and Instrument Segmentation (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals from an instrument segmentation model to tackle the multi-granularity of our benchmark. We demonstrate TAPIS’s versatility and state-of-the-art performance across different tasks through extensive experimentation on GraSP and alternative benchmarks. This work represents a foundational step forward in Endoscopic Vision, offering a novel framework for future research towards holistic surgical scene understanding.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103726"},"PeriodicalIF":11.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144799560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synomaly noise and multi-stage diffusion: A novel approach for unsupervised anomaly detection in medical images","authors":"Yuan Bi , Lucie Huang , Ricarda Clarenbach , Reza Ghotbi , Angelos Karlas , Nassir Navab , Zhongliang Jiang","doi":"10.1016/j.media.2025.103737","DOIUrl":"10.1016/j.media.2025.103737","url":null,"abstract":"<div><div>Anomaly detection in medical imaging plays a crucial role in identifying pathological regions across various imaging modalities, such as brain MRI, liver CT, and carotid ultrasound (US). However, training fully supervised segmentation models is often hindered by the scarcity of expert annotations and the complexity of diverse anatomical structures. To address these issues, we propose a novel unsupervised anomaly detection framework based on a diffusion model that incorporates a synthetic anomaly (Synomaly) noise function and a multi-stage diffusion process. Synomaly noise introduces synthetic anomalies into healthy images during training, allowing the model to effectively learn anomaly removal. The multi-stage diffusion process is introduced to progressively denoise images, preserving fine details while improving the quality of anomaly-free reconstructions. The generated high-fidelity counterfactual healthy images can further enhance the interpretability of the segmentation models, as well as provide a reliable baseline for evaluating the extent of anomalies and supporting clinical decision-making. Notably, the unsupervised anomaly detection model is trained purely on healthy images, eliminating the need for anomalous training samples and pixel-level annotations. We validate the proposed approach on brain MRI, liver CT datasets, and carotid US. The experimental results demonstrate that the proposed framework outperforms existing state-of-the-art unsupervised anomaly detection methods, achieving performance comparable to fully supervised segmentation models in the US dataset. Ablation studies further highlight the contributions of Synomaly noise and the multi-stage diffusion process in improving anomaly segmentation. These findings underscore the potential of our approach as a robust and annotation-efficient alternative for medical anomaly detection. <strong>Code:</strong> <span><span>https://github.com/yuan-12138/Synomaly</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103737"},"PeriodicalIF":11.8,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating cardiac radial-MRI: Fully polar based technique using compressed sensing and deep learning","authors":"Vahid Ghodrati , Jinming Duan , Fadil Ali , Arash Bedayat , Ashley Prosper , Mark Bydder","doi":"10.1016/j.media.2025.103732","DOIUrl":"10.1016/j.media.2025.103732","url":null,"abstract":"<div><div>Fast radial-MRI approaches based on compressed sensing (CS) and deep learning (DL) often use non-uniform fast Fourier transform (NUFFT) as the forward imaging operator, which might introduce interpolation errors and reduce image quality. Using the polar Fourier transform (PFT), we developed fully polar CS and DL algorithms for fast 2D cardiac radial-MRI. Our methods directly reconstruct images in polar spatial space from polar k-space data, eliminating frequency interpolation and ensuring an easy-to-compute data consistency term for the DL framework via the variable splitting (VS) scheme. Furthermore, PFT reconstruction produces initial images with fewer artifacts in a reduced field of view, making it a better starting point for CS and DL algorithms, especially for dynamic imaging, where information from a small region of interest is critical, as opposed to NUFFT, which often results in global streaking artifacts. In the cardiac region, PFT-based CS technique outperformed NUFFT-based CS at acceleration rates of 5x (mean SSIM: 0.8831 vs. 0.8526), 10x (0.8195 vs. 0.7981), and 15x (0.7720 vs. 0.7503). Our PFT(VS)-DL technique outperformed the NUFFT(GD)-based DL method, which used unrolled gradient descent with the NUFFT as the forward imaging operator, with mean SSIM scores of 0.8914 versus 0.8617 at 10x and 0.8470 versus 0.8301 at 15x. Radiological assessments revealed that PFT(VS)-based DL scored <span><math><mrow><mn>2</mn><mo>.</mo><mn>9</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>30</mn></mrow></math></span> and <span><math><mrow><mn>2</mn><mo>.</mo><mn>73</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>45</mn></mrow></math></span> at 5x and 10x, whereas NUFFT(GD)-based DL scored <span><math><mrow><mn>2</mn><mo>.</mo><mn>7</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>47</mn></mrow></math></span> and <span><math><mrow><mn>2</mn><mo>.</mo><mn>40</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>50</mn></mrow></math></span>, respectively. Our methods suggest a promising alternative to NUFFT-based fast radial-MRI for dynamic imaging, prioritizing reconstruction quality in a small region of interest over whole image quality.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103732"},"PeriodicalIF":11.8,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SemiSAM+: Rethinking semi-supervised medical image segmentation in the era of foundation models","authors":"Yichi Zhang , Bohao Lv , Le Xue , Wenbo Zhang , Yuchen Liu , Yu Fu , Yuan Cheng , Yuan Qi","doi":"10.1016/j.media.2025.103733","DOIUrl":"10.1016/j.media.2025.103733","url":null,"abstract":"<div><div>Deep learning-based medical image segmentation typically requires large amount of labeled data for training, making it less applicable in clinical settings due to high annotation cost. Semi-supervised learning (SSL) has emerged as an appealing strategy due to its less dependence on acquiring abundant annotations from experts compared to fully supervised methods. Beyond existing model-centric advancements of SSL by designing novel regularization strategies, we anticipate a paradigmatic shift due to the emergence of promptable segmentation foundation models with universal segmentation capabilities using positional prompts represented by Segment Anything Model (SAM). In this paper, we present <strong>SemiSAM+</strong>, a foundation model-driven SSL framework to efficiently learn from limited labeled data for medical image segmentation. SemiSAM+ consists of one or multiple promptable foundation models as <strong>generalist models</strong>, and a trainable task-specific segmentation model as <strong>specialist model</strong>. For a given new segmentation task, the training is based on the specialist–generalist collaborative learning procedure, where the trainable specialist model delivers positional prompts to interact with the frozen generalist models to acquire pseudo-labels, and then the generalist model output provides the specialist model with informative and efficient supervision which benefits the automatic segmentation and prompt generation in turn. Extensive experiments on three public datasets and one in-house clinical dataset demonstrate that SemiSAM+ achieves significant performance improvement, especially under extremely limited annotation scenarios, and shows strong efficiency as a plug-and-play strategy that can be easily adapted to different specialist and generalist models.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103733"},"PeriodicalIF":11.8,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}