Clinical Prompt Learning With Frozen Language Models
Niall Taylor; Yi Zhang; Dan W. Joyce; Ziming Gao; Andrey Kormilitzin; Alejo Nevado-Holgado
IEEE Transactions on Neural Networks and Learning Systems, vol. 35, no. 11, pp. 16453-16463, published 2023-08-11
DOI: 10.1109/TNNLS.2023.3294633
https://ieeexplore.ieee.org/document/10215061/
Citations: 0
Abstract
When the first transformer-based language models were published in the late 2010s, pretraining on general text and then fine-tuning the model on a task-specific dataset often achieved state-of-the-art performance. However, more recent work suggests that for some tasks, directly prompting the pretrained model matches or surpasses fine-tuning while requiring few or no model parameter updates. The use of prompts with language models for natural language processing (NLP) tasks is known as prompt learning. We investigated the viability of prompt learning on clinically meaningful decision tasks and directly compared it with more traditional fine-tuning methods. Results show that prompt learning methods matched or surpassed the performance of traditional fine-tuning with up to 1000 times fewer trainable parameters, less training time, less training data, and lower computational resource requirements. We argue that these characteristics make prompt learning a highly desirable alternative to traditional fine-tuning for clinical tasks, where the computational resources of public health providers are limited and where data often cannot be made available or used for fine-tuning due to patient privacy concerns. The code to reproduce the experiments presented in this work can be found at https://github.com/NtaylorOX/Public_Clinical_Prompt.
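To make the "frozen model, trainable prompt" idea concrete, below is a minimal sketch of one common prompt-learning variant, soft (continuous) prompt tuning, in which every parameter of the pretrained encoder is frozen and only a small set of prompt embeddings plus a classification head is trained. This is an illustrative assumption rather than the authors' implementation (their actual code is in the repository linked above); the model name bert-base-uncased, the class SoftPromptClassifier, and the classification head are hypothetical choices for the sketch.

```python
# A minimal sketch of soft prompt tuning with a frozen language model.
# NOT the paper's implementation; model name, class, and head are
# illustrative assumptions for demonstration only.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SoftPromptClassifier(nn.Module):
    def __init__(self, model_name="bert-base-uncased",
                 n_prompt_tokens=20, n_classes=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        # Freeze every parameter of the pretrained LM: only the prompt
        # embeddings and the small classification head are trained.
        for p in self.encoder.parameters():
            p.requires_grad = False
        hidden = self.encoder.config.hidden_size
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, hidden) * 0.02)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, input_ids, attention_mask):
        # Embed input tokens with the frozen embedding layer.
        tok_emb = self.encoder.get_input_embeddings()(input_ids)
        batch = tok_emb.size(0)
        # Prepend the trainable soft prompt to every sequence.
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, tok_emb], dim=1)
        prompt_mask = torch.ones(batch, self.prompt.size(0),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        out = self.encoder(inputs_embeds=inputs_embeds, attention_mask=mask)
        # Classify from the hidden state at the first prompt position.
        return self.head(out.last_hidden_state[:, 0])


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = SoftPromptClassifier()
batch = tokenizer(["patient presents with chest pain"],
                  return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])

# Only the prompt and head receive gradients, so the trainable parameter
# count is tiny compared with full fine-tuning of the encoder.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable, logits.shape)
```

In this setup an optimizer would be built only over the trainable subset, e.g. torch.optim.Adam(p for p in model.parameters() if p.requires_grad), which is what yields the orders-of-magnitude reduction in trainable parameters, training time, and compute that the abstract reports.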
About the journal:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.