How important is domain-specific language model pretraining and instruction finetuning for biomedical relation extraction?

Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info... Pub Date : 2026-01-01 Epub Date: 2025-07-01 DOI:10.1007/978-3-031-97141-9_6

Aviv Brokman, Ramakanth Kavuluru

Volume 15836, pages 80-94. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12367199/pdf/

Citations: 0

Abstract



Major technical advances in the general NLP domain are often subsequently applied to the high-value, data-rich biomedical domain. The past few years have seen generative language models (LMs), instruction finetuning, and few-shot learning become foci of NLP research. As such, generative LMs pretrained on biomedical corpora have proliferated and biomedical instruction finetuning has been attempted as well, all with the hope that domain specificity improves performance on downstream tasks. Given the nontrivial effort in training such models, we investigate what, if any, benefits they have in the key biomedical NLP task of relation extraction. Specifically, we address two questions: (1) Do LMs trained on biomedical corpora outperform those trained on general domain corpora? (2) Do models instruction finetuned on biomedical datasets outperform those finetuned on assorted datasets or those simply pretrained? We tackle these questions using existing LMs, testing across four datasets. In a surprising result, general-domain models typically outperformed biomedical-domain models. However, biomedical instruction finetuning improved performance to a similar degree as general instruction finetuning, despite having orders of magnitude fewer instructions. Our findings suggest it may be more fruitful to focus research effort on larger-scale biomedical instruction finetuning of general LMs over building domain-specific biomedical LMs.

