Task-specific prompting SAM for multi-task gastric cancer diagnosis in endoscopic images

IF 7.5 · CAS Tier 1 (Computer Science) · JCR Q1, Computer Science, Artificial Intelligence
Bilin Wang, Changda Lei, Yunbo Guo, Kaicheng Hong, Xiuji Kan, Yifan Ouyang, Junbo Li, Rui Li
DOI: 10.1016/j.eswa.2025.128329
Journal: Expert Systems with Applications, Volume 289, Article 128329
Publication date: 2025-05-25
URL: https://www.sciencedirect.com/science/article/pii/S0957417425019487
Citations: 0

Abstract

Deep learning is increasingly applied in gastroscopic imaging to assist in lesion detection and diagnosis. However, less attention has been given to analyzing critical lesion characteristics, such as differentiation and invasion depth, factors essential for prognosis and treatment planning. The Segment Anything Model (SAM) introduces robust semantic feature extraction capabilities paired with flexible prompting mechanisms, enabling the model to focus on specific regions of interest. However, designing prompts that effectively capture the diverse semantic information needed for distinct tasks remains underexplored. In this paper, we propose Task-Specific Prompt SAM (TSP-SAM), an innovative framework that employs Task-Specific Prompt Generation (TSPG) and Embedding Prompt Fusion (EPF) to capture multi-perspective features for comprehensive classification and segmentation. Specifically, TSP-SAM leverages SAM's powerful feature extraction by integrating multi-scale embeddings with distilled prompt information. This allows the model to refine segmentation, staging, differentiation, and infiltration depth classification within a unified framework. We evaluate TSP-SAM on a newly developed gastroscopy dataset comprising over 3600 images with pixel-level and pathology-based annotations. Extensive experiments demonstrate that TSP-SAM consistently outperforms both traditional and advanced single-task models, showcasing its superior capability for joint optimization across multiple tasks.
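The core idea in the abstract, one learned prompt per task fused with a shared encoder embedding and routed to per-task heads, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gated-addition fusion, the toy linear heads, and all names (`fuse`, `task_prompts`, `heads`) are assumptions standing in for TSPG/EPF, and the SAM encoder is replaced by a random vector.

```python
import numpy as np

# Hypothetical sketch of the TSP-SAM flow described in the abstract.
# A shared image embedding (here random, standing in for a SAM-style
# encoder output) is fused with one task-specific prompt vector per
# task, then passed to a separate head for each diagnostic task.

rng = np.random.default_rng(0)

D = 32  # embedding dimension (illustrative)
TASKS = ["segmentation", "staging", "differentiation", "invasion_depth"]

image_embedding = rng.standard_normal(D)
# Task-Specific Prompt Generation (TSPG), sketched as one learned vector per task.
task_prompts = {t: rng.standard_normal(D) for t in TASKS}

def fuse(img_emb, prompt):
    """Embedding Prompt Fusion (EPF), sketched as an element-wise sigmoid gate."""
    gate = 1.0 / (1.0 + np.exp(-(img_emb * prompt)))
    return gate * img_emb + (1.0 - gate) * prompt

# Toy per-task linear heads, 3 output classes each (real heads would differ,
# e.g. a mask decoder for segmentation).
heads = {t: rng.standard_normal((3, D)) for t in TASKS}

logits = {t: heads[t] @ fuse(image_embedding, task_prompts[t]) for t in TASKS}
predictions = {t: int(np.argmax(v)) for t, v in logits.items()}
print(predictions)
```

The key design point this sketch mirrors is that the encoder is shared while only the prompt and head differ per task, which is what makes joint optimization across segmentation, staging, differentiation, and invasion-depth classification possible in one framework.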
Source journal
Expert Systems with Applications (CAS category: Engineering Technology, Engineering: Electrical & Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles per year: 2045
Average review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.