Efficiency at scale: Investigating the performance of diminutive language models in clinical tasks

Impact factor: 6.1; Zone 2 (Medicine); JCR Q1 (Computer Science, Artificial Intelligence)
Niall Taylor, Upamanyu Ghose, Omid Rohanian, Mohammadmahdi Nouriborji, Andrey Kormilitzin, David A. Clifton, Alejo Nevado-Holgado
Journal: Artificial Intelligence in Medicine, volume 157, Article 103002
Published: 2024-11-01
DOI: 10.1016/j.artmed.2024.103002
URL: https://www.sciencedirect.com/science/article/pii/S0933365724002446
Open access: no
Citations: 0

Abstract

The entry of large language models (LLMs) into research and commercial spaces has led to a trend of ever-larger models, with initial promises of generalisability. This was followed by a widespread desire to downsize and create specialised models without the need for complete fine-tuning, using Parameter Efficient Fine-tuning (PEFT) methods. We present an investigation into the suitability of different PEFT methods for clinical decision-making tasks, across a range of model sizes, including extremely small models with as few as 25 million parameters.
Our analysis shows that the performance of most PEFT approaches varies significantly from one task to another, with the exception of LoRA, which maintains relatively high performance across all model sizes and tasks, typically approaching or matching full fine-tuned performance. The effectiveness of PEFT methods in the clinical domain is evident, particularly for specialised models which can operate on low-cost, in-house computing infrastructure. The advantages of these models, in terms of speed and reduced training costs, dramatically outweigh any performance gain from large foundation LLMs. Furthermore, we highlight how domain-specific pre-training interacts with PEFT methods and model size, finding domain pre-training to be particularly important in smaller models, and we discuss how these factors interplay to provide the best efficiency-performance trade-off. Full code available at: https://github.com/nlpie-research/efficient-ml.
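
For readers unfamiliar with LoRA, the sketch below illustrates the general idea behind it: the pre-trained weight matrix W stays frozen and only a low-rank update B·A is trained, so far fewer parameters change than in full fine-tuning. This is not the authors' code (that is in the repository linked above). The layer sizes, rank, alpha value, and the helper names lora_trainable_params, matvec, and lora_forward are all assumptions made for illustration, and the zero initialisation of B follows the common LoRA convention rather than anything stated in the abstract.

```python
import random


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters trained by LoRA for one d_out x d_in weight matrix.

    LoRA freezes W and learns a low-rank update B @ A,
    with A of shape (rank, d_in) and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """Compute W x + (alpha / rank) * B (A x) without forming B @ A."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Hypothetical sizes for illustration only (not taken from the paper).
d_in, d_out, rank, alpha = 8, 6, 2, 4.0
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable
B = [[0.0] * rank for _ in range(d_out)]  # trainable, zero-initialised
x = [rng.gauss(0, 1) for _ in range(d_in)]

full = d_in * d_out
lora = lora_trainable_params(d_in, d_out, rank)
print(f"full fine-tuning: {full} params, LoRA: {lora} params")

# With B initialised to zero, the adapted layer starts out identical to the frozen one.
assert lora_forward(W, A, B, x, alpha, rank) == matvec(W, x)
```

The saving grows with layer size: for a 768 x 768 projection and rank 8, the trainable count under this scheme is 8 * (768 + 768) = 12,288, against 589,824 for the full matrix.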
Source journal: Artificial Intelligence in Medicine
Subject category: Engineering and Technology, Engineering: Biomedical
CiteScore: 15.00
Self-citation rate: 2.70%
Articles published: 143
Review time: 6.3 months
About the journal: Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care. Artificial intelligence in medicine may be characterized as the scientific discipline pertaining to research studies, projects, and applications that aim at supporting decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions that ultimately support and improve the performance of a human care provider.