PMC-LaMA：为医学建立开源语言模型。

Journal of the American Medical Informatics Association : JAMIA Pub Date : 2024-04-13 DOI:10.1093/jamia/ocae045

Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Weidi Xie, Yanfeng Wang

{"title":"PMC-LaMA：为医学建立开源语言模型。","authors":"Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Weidi Xie, Yanfeng Wang","doi":"10.1093/jamia/ocae045","DOIUrl":null,"url":null,"abstract":"OBJECTIVE\nRecently, large language models (LLMs) have showcased remarkable capabilities in natural language understanding. While demonstrating proficiency in everyday conversations and question-answering (QA) situations, these models frequently struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge. In this article, we describe the procedure for building a powerful, open-source language model specifically designed for medicine applications, termed as PMC-LLaMA.\n\n\nMATERIALS AND METHODS\nWe adapt a general-purpose LLM toward the medical domain, involving data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive domain-specific instruction fine-tuning, encompassing medical QA, rationale for reasoning, and conversational dialogues with 202M tokens.\n\n\nRESULTS\nWhile evaluating various public medical QA benchmarks and manual rating, our lightweight PMC-LLaMA, which consists of only 13B parameters, exhibits superior performance, even surpassing ChatGPT. All models, codes, and datasets for instruction tuning will be released to the research community.\n\n\nDISCUSSION\nOur contributions are 3-fold: (1) we build up an open-source LLM toward the medical domain. We believe the proposed PMC-LLaMA model can promote further development of foundation models in medicine, serving as a medical trainable basic generative language backbone; (2) we conduct thorough ablation studies to demonstrate the effectiveness of each proposed component, demonstrating how different training data and model scales affect medical LLMs; (3) we contribute a large-scale, comprehensive dataset for instruction tuning.\n\n\nCONCLUSION\nIn this article, we systematically investigate the process of building up an open-source medical-specific LLM, PMC-LLaMA.","PeriodicalId":236137,"journal":{"name":"Journal of the American Medical Informatics Association : JAMIA","volume":"13 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"PMC-LLaMA: toward building open-source language models for medicine.\",\"authors\":\"Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Weidi Xie, Yanfeng Wang\",\"doi\":\"10.1093/jamia/ocae045\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"OBJECTIVE\\nRecently, large language models (LLMs) have showcased remarkable capabilities in natural language understanding. While demonstrating proficiency in everyday conversations and question-answering (QA) situations, these models frequently struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge. In this article, we describe the procedure for building a powerful, open-source language model specifically designed for medicine applications, termed as PMC-LLaMA.\\n\\n\\nMATERIALS AND METHODS\\nWe adapt a general-purpose LLM toward the medical domain, involving data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive domain-specific instruction fine-tuning, encompassing medical QA, rationale for reasoning, and conversational dialogues with 202M tokens.\\n\\n\\nRESULTS\\nWhile evaluating various public medical QA benchmarks and manual rating, our lightweight PMC-LLaMA, which consists of only 13B parameters, exhibits superior performance, even surpassing ChatGPT. All models, codes, and datasets for instruction tuning will be released to the research community.\\n\\n\\nDISCUSSION\\nOur contributions are 3-fold: (1) we build up an open-source LLM toward the medical domain. We believe the proposed PMC-LLaMA model can promote further development of foundation models in medicine, serving as a medical trainable basic generative language backbone; (2) we conduct thorough ablation studies to demonstrate the effectiveness of each proposed component, demonstrating how different training data and model scales affect medical LLMs; (3) we contribute a large-scale, comprehensive dataset for instruction tuning.\\n\\n\\nCONCLUSION\\nIn this article, we systematically investigate the process of building up an open-source medical-specific LLM, PMC-LLaMA.\",\"PeriodicalId\":236137,\"journal\":{\"name\":\"Journal of the American Medical Informatics Association : JAMIA\",\"volume\":\"13 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-04-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of the American Medical Informatics Association : JAMIA\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/jamia/ocae045\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Medical Informatics Association : JAMIA","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/jamia/ocae045","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

目标最近，大型语言模型（LLM）在自然语言理解方面展现出了非凡的能力。虽然这些模型在日常对话和问题解答（QA）情况下表现出了熟练的能力，但由于缺乏特定领域的知识，它们在需要精确性的领域（如医疗应用）中往往举步维艰。在本文中，我们介绍了建立一个功能强大的开源语言模型的过程，该模型专为医学应用而设计，被称为 PMC-LaMA。结果在评估各种公共医疗质量保证基准和人工评级时，我们的轻量级 PMC-LaMA 仅由 13B 个参数组成，表现出卓越的性能，甚至超过了 ChatGPT。所有模型、代码和用于指导调整的数据集都将向研究界发布。我们的贡献有三个方面：（1）我们建立了面向医疗领域的开源 LLM。我们相信，所提出的 PMC-LaMA 模型可以促进医学基础模型的进一步发展，成为医学可训练的基础生成语言骨干；（2）我们进行了全面的消融研究，以证明所提出的每个组件的有效性，展示了不同的训练数据和模型规模对医学 LLM 的影响；（3）我们贡献了一个大规模、全面的数据集，用于指令调优。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

PMC-LLaMA: toward building open-source language models for medicine.

OBJECTIVE Recently, large language models (LLMs) have showcased remarkable capabilities in natural language understanding. While demonstrating proficiency in everyday conversations and question-answering (QA) situations, these models frequently struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge. In this article, we describe the procedure for building a powerful, open-source language model specifically designed for medicine applications, termed as PMC-LLaMA. MATERIALS AND METHODS We adapt a general-purpose LLM toward the medical domain, involving data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive domain-specific instruction fine-tuning, encompassing medical QA, rationale for reasoning, and conversational dialogues with 202M tokens. RESULTS While evaluating various public medical QA benchmarks and manual rating, our lightweight PMC-LLaMA, which consists of only 13B parameters, exhibits superior performance, even surpassing ChatGPT. All models, codes, and datasets for instruction tuning will be released to the research community. DISCUSSION Our contributions are 3-fold: (1) we build up an open-source LLM toward the medical domain. We believe the proposed PMC-LLaMA model can promote further development of foundation models in medicine, serving as a medical trainable basic generative language backbone; (2) we conduct thorough ablation studies to demonstrate the effectiveness of each proposed component, demonstrating how different training data and model scales affect medical LLMs; (3) we contribute a large-scale, comprehensive dataset for instruction tuning. CONCLUSION In this article, we systematically investigate the process of building up an open-source medical-specific LLM, PMC-LLaMA.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of the American Medical Informatics Association : JAMIA

自引率

0.00%

发文量