Cost-effective instruction learning for pathology vision and language analysis

IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Kaitao Chen, Mianxin Liu, Fang Yan, Lei Ma, Xiaoming Shi, Lilong Wang, Xiaosong Wang, Lifeng Zhu, Zhe Wang, Mu Zhou, Shaoting Zhang
{"title":"Cost-effective instruction learning for pathology vision and language analysis","authors":"Kaitao Chen, Mianxin Liu, Fang Yan, Lei Ma, Xiaoming Shi, Lilong Wang, Xiaosong Wang, Lifeng Zhu, Zhe Wang, Mu Zhou, Shaoting Zhang","doi":"10.1038/s43588-025-00818-5","DOIUrl":null,"url":null,"abstract":"The advent of vision–language models fosters interactive conversations between artificial intelligence-enabled models and humans. However, applying these models in the clinic faces challenges related to large-scale training data as well as financial and computational resources. Here we propose CLOVER, a cost-effective instruction learning framework for conversational pathology. CLOVER trains a lightweight module and uses instruction tuning while freezing the parameters of the large language model. Instead of using costly GPT-4, we propose well-designed prompts on GPT-3.5 for building generation-based instructions, emphasizing the utility of pathological knowledge derived from the Internet source. We construct a high-quality set of template-based instructions in the context of digital pathology. Using two benchmark datasets, our findings reveal the strength of hybrid-form, pathological visual question–answer instructions. CLOVER outperforms baselines that possess 37 times more training parameters and exhibits few-shot capacity on an external clinical dataset. CLOVER could thus accelerate the adoption of rapid conversational applications in digital pathology. Training foundation models often requires a costly budget and excessive computational resources. In this study, a low-cost instruction learning framework is proposed that could enable the rapid adoption of visual-language pathology applications.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 7","pages":"524-533"},"PeriodicalIF":18.3000,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature computational science","FirstCategoryId":"1085","ListUrlMain":"https://www.nature.com/articles/s43588-025-00818-5","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

The advent of vision–language models fosters interactive conversations between artificial intelligence-enabled models and humans. However, applying these models in the clinic faces challenges related to large-scale training data as well as financial and computational resources. Here we propose CLOVER, a cost-effective instruction learning framework for conversational pathology. CLOVER trains a lightweight module and uses instruction tuning while freezing the parameters of the large language model. Instead of using costly GPT-4, we propose well-designed prompts on GPT-3.5 for building generation-based instructions, emphasizing the utility of pathological knowledge derived from the Internet source. We construct a high-quality set of template-based instructions in the context of digital pathology. Using two benchmark datasets, our findings reveal the strength of hybrid-form, pathological visual question–answer instructions. CLOVER outperforms baselines that possess 37 times more training parameters and exhibits few-shot capacity on an external clinical dataset. CLOVER could thus accelerate the adoption of rapid conversational applications in digital pathology. Training foundation models often requires a costly budget and excessive computational resources. In this study, a low-cost instruction learning framework is proposed that could enable the rapid adoption of visual-language pathology applications.

Abstract Image

具有成本效益的病理视觉与语言分析教学。
视觉语言模型的出现促进了人工智能模型与人类之间的互动对话。然而,将这些模型应用于临床面临着与大规模训练数据以及财务和计算资源相关的挑战。在这里,我们提出CLOVER,一个具有成本效益的会话病理学教学框架。CLOVER训练一个轻量级模块,并在冻结大型语言模型参数的同时使用指令调优。我们不使用昂贵的GPT-4,而是在GPT-3.5上提出精心设计的提示,用于构建基于生成的指令,强调从互联网来源获得的病理知识的效用。我们在数字病理学的背景下构建了一套高质量的基于模板的指令。使用两个基准数据集,我们的发现揭示了混合形式的力量,病理视觉问答指令。CLOVER优于基线,拥有37倍以上的训练参数,并在外部临床数据集上显示少量的射击能力。因此,CLOVER可以加速数字病理学中快速会话应用的采用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
11.70
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信