Lulin Yuan, Yifeng Zheng, Weiqiang Liu, Hong Zhao, Wenjie Zhang, Baoya Wei, Liming Chen
SLDPC: Slide-Level Dual-Prompt Collaboration for few-shot whole slide image classification
Computerized Medical Imaging and Graphics, Volume 131, Article 102768
DOI: 10.1016/j.compmedimag.2026.102768
Publication date: 2026-05-01 (Epub 2026-04-21)
Citations: 0
Abstract
Digital pathology standardizes diagnostic workflows through the digitization of conventional slides and the integration of algorithmic analysis. Few-shot Weakly Supervised Whole Slide Image (WSI) Classification (FSWC) represents a critical challenge in digital pathology. Conventional Multiple Instance Learning (MIL) methods rely on large volumes of annotated data and are susceptible to distribution shifts. Vision-Language Model (VLM)-based prompt learning methods enable parameter-efficient few-shot learning but are limited to patch-level feature aggregation, failing to model slide-level diagnostic information. As slide-level information is crucial for understanding tissue architecture and lesion distribution, we propose a Slide-Level Dual-Prompt Collaboration (SLDPC) framework for the FSWC task. Specifically, SLDPC leverages the representation learning capability of a slide-level VLM to perform prompt tuning directly at the slide level. A base prompt P is first obtained through continuous prompt initialization training and subsequently cloned to derive a parallel prompt P′. In addition, a bidirectional InfoNCE loss is employed to enhance feature-level alignment. During inference, a weighted fusion mechanism is introduced to combine both prompts and achieve efficient adaptation of slide-level multimodal representations. Experimental evaluation on four datasets validates the superiority of SLDPC. The results demonstrate that slide-level prompt learning effectively addresses FSWC challenges and improves model performance.
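The abstract names two generic components — a bidirectional (symmetric) InfoNCE alignment loss and a weighted fusion of the two prompt banks at inference. SLDPC's actual architecture is not specified here, so the following is only an illustrative NumPy sketch of those two ideas; the function names, the temperature `tau`, and the fusion weight `alpha` are assumptions, not details from the paper.

```python
import numpy as np

def l2norm(x):
    """Normalize each row to unit length (cosine-similarity preparation)."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def bidirectional_infonce(slide_emb, prompt_emb, tau=0.07):
    """Symmetric InfoNCE: average of slide->prompt and prompt->slide
    cross-entropy, where matching pairs sit on the diagonal of the
    similarity matrix."""
    sim = l2norm(slide_emb) @ l2norm(prompt_emb).T / tau

    def ce(logits):
        # Numerically stable log-softmax over each row, then take the
        # negative log-probability of the diagonal (matching) entries.
        logits = logits - logits.max(axis=1, keepdims=True)
        logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    return 0.5 * (ce(sim) + ce(sim.T))

def fused_logits(slide_emb, base_prompts, parallel_prompts, alpha=0.5):
    """Weighted fusion at inference: combine class logits produced by the
    base prompt bank P and the cloned parallel bank P'."""
    s = l2norm(slide_emb)
    logits_base = s @ l2norm(base_prompts).T
    logits_parallel = s @ l2norm(parallel_prompts).T
    return alpha * logits_base + (1 - alpha) * logits_parallel
```

As a usage sketch, the loss is minimized during prompt tuning while the embeddings stay fixed except for the prompt vectors; at inference, `fused_logits` scores each slide against both prompt banks and the class with the highest fused logit is predicted. When the two banks coincide, the fusion reduces to the single-bank logits for any `alpha`.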
Journal introduction:
The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.