在医院员工调查中利用开源大型语言模型进行数据扩充：混合方法研究。

IF 3.2 Q1 EDUCATION, SCIENTIFIC DISCIPLINES

JMIR Medical Education Pub Date : 2024-11-19 DOI:10.2196/51433

Carl Ehrett, Sudeep Hegde, Kwame Andre, Dixizi Liu, Timothy Wilson

{"title":"在医院员工调查中利用开源大型语言模型进行数据扩充：混合方法研究。","authors":"Carl Ehrett, Sudeep Hegde, Kwame Andre, Dixizi Liu, Timothy Wilson","doi":"10.2196/51433","DOIUrl":null,"url":null,"abstract":"Background: Generative large language models (LLMs) have the potential to revolutionize medical education by generating tailored learning materials, enhancing teaching efficiency, and improving learner engagement. However, the application of LLMs in health care settings, particularly for augmenting small datasets in text classification tasks, remains underexplored, particularly for cost- and privacy-conscious applications that do not permit the use of third-party services such as OpenAI's ChatGPT.Objective: This study aims to explore the use of open-source LLMs, such as Large Language Model Meta AI (LLaMA) and Alpaca models, for data augmentation in a specific text classification task related to hospital staff surveys.Methods: The surveys were designed to elicit narratives of everyday adaptation by frontline radiology staff during the initial phase of the COVID-19 pandemic. A 2-step process of data augmentation and text classification was conducted. The study generated synthetic data similar to the survey reports using 4 generative LLMs for data augmentation. A different set of 3 classifier LLMs was then used to classify the augmented text for thematic categories. The study evaluated performance on the classification task.Results: The overall best-performing combination of LLMs, temperature, classifier, and number of synthetic data cases is via augmentation with LLaMA 7B at temperature 0.7 with 100 augments, using Robustly Optimized BERT Pretraining Approach (RoBERTa) for the classification task, achieving an average area under the receiver operating characteristic (AUC) curve of 0.87 (SD 0.02; ie, 1 SD). The results demonstrate that open-source LLMs can enhance text classifiers' performance for small datasets in health care contexts, providing promising pathways for improving medical education processes and patient care practices.Conclusions: The study demonstrates the value of data augmentation with open-source LLMs, highlights the importance of privacy and ethical considerations when using LLMs, and suggests future directions for research in this field.","PeriodicalId":36236,"journal":{"name":"JMIR Medical Education","volume":"10 ","pages":"e51433"},"PeriodicalIF":3.2000,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590755/pdf/","citationCount":"0","resultStr":"{\"title\":\"Leveraging Open-Source Large Language Models for Data Augmentation in Hospital Staff Surveys: Mixed Methods Study.\",\"authors\":\"Carl Ehrett, Sudeep Hegde, Kwame Andre, Dixizi Liu, Timothy Wilson\",\"doi\":\"10.2196/51433\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Background: Generative large language models (LLMs) have the potential to revolutionize medical education by generating tailored learning materials, enhancing teaching efficiency, and improving learner engagement. However, the application of LLMs in health care settings, particularly for augmenting small datasets in text classification tasks, remains underexplored, particularly for cost- and privacy-conscious applications that do not permit the use of third-party services such as OpenAI's ChatGPT.Objective: This study aims to explore the use of open-source LLMs, such as Large Language Model Meta AI (LLaMA) and Alpaca models, for data augmentation in a specific text classification task related to hospital staff surveys.Methods: The surveys were designed to elicit narratives of everyday adaptation by frontline radiology staff during the initial phase of the COVID-19 pandemic. A 2-step process of data augmentation and text classification was conducted. The study generated synthetic data similar to the survey reports using 4 generative LLMs for data augmentation. A different set of 3 classifier LLMs was then used to classify the augmented text for thematic categories. The study evaluated performance on the classification task.Results: The overall best-performing combination of LLMs, temperature, classifier, and number of synthetic data cases is via augmentation with LLaMA 7B at temperature 0.7 with 100 augments, using Robustly Optimized BERT Pretraining Approach (RoBERTa) for the classification task, achieving an average area under the receiver operating characteristic (AUC) curve of 0.87 (SD 0.02; ie, 1 SD). The results demonstrate that open-source LLMs can enhance text classifiers' performance for small datasets in health care contexts, providing promising pathways for improving medical education processes and patient care practices.Conclusions: The study demonstrates the value of data augmentation with open-source LLMs, highlights the importance of privacy and ethical considerations when using LLMs, and suggests future directions for research in this field.\",\"PeriodicalId\":36236,\"journal\":{\"name\":\"JMIR Medical Education\",\"volume\":\"10 \",\"pages\":\"e51433\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2024-11-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590755/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JMIR Medical Education\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2196/51433\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION, SCIENTIFIC DISCIPLINES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Medical Education","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/51433","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION, SCIENTIFIC DISCIPLINES","Score":null,"Total":0}

引用次数: 0

摘要

背景：生成式大型语言模型（LLMs）可生成量身定制的学习材料、提高教学效率并改善学习者的参与度，从而有望彻底改变医学教育。然而，LLMs 在医疗环境中的应用，尤其是在文本分类任务中用于扩充小型数据集的应用，仍未得到充分探索，特别是在成本和隐私意识较高的应用中，因为这些应用不允许使用第三方服务，如 OpenAI 的 ChatGPT：本研究旨在探索开源 LLM（如大型语言模型元人工智能（LLaMA）和 Alpaca 模型）在与医院员工调查相关的特定文本分类任务中的数据增强应用：调查旨在了解一线放射科工作人员在 COVID-19 大流行初期的日常适应情况。研究采用了数据扩充和文本分类两个步骤。研究使用 4 个生成式 LLM 生成与调查报告类似的合成数据，用于数据扩增。然后使用一组不同的 3 个分类器 LLM 对增强文本进行主题分类。研究评估了分类任务的性能：LLMs、温度、分类器和合成数据个数的最佳组合是在温度为 0.7 的条件下使用 LLaMA 7B 进行扩增，并使用稳健优化的 BERT 预训练方法 (RoBERTa) 进行 100 个扩增，从而完成分类任务，其接收者操作特征曲线下的平均面积 (AUC) 为 0.87（SD 0.02；即 1 SD）。研究结果表明，开源 LLM 可以提高医疗保健领域小型数据集文本分类器的性能，为改善医学教育流程和患者护理实践提供了前景广阔的途径：本研究证明了使用开源 LLMs 增强数据的价值，强调了使用 LLMs 时隐私和伦理考虑的重要性，并提出了该领域未来的研究方向。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Leveraging Open-Source Large Language Models for Data Augmentation in Hospital Staff Surveys: Mixed Methods Study.

Background: Generative large language models (LLMs) have the potential to revolutionize medical education by generating tailored learning materials, enhancing teaching efficiency, and improving learner engagement. However, the application of LLMs in health care settings, particularly for augmenting small datasets in text classification tasks, remains underexplored, particularly for cost- and privacy-conscious applications that do not permit the use of third-party services such as OpenAI's ChatGPT.

Objective: This study aims to explore the use of open-source LLMs, such as Large Language Model Meta AI (LLaMA) and Alpaca models, for data augmentation in a specific text classification task related to hospital staff surveys.

Methods: The surveys were designed to elicit narratives of everyday adaptation by frontline radiology staff during the initial phase of the COVID-19 pandemic. A 2-step process of data augmentation and text classification was conducted. The study generated synthetic data similar to the survey reports using 4 generative LLMs for data augmentation. A different set of 3 classifier LLMs was then used to classify the augmented text for thematic categories. The study evaluated performance on the classification task.

Results: The overall best-performing combination of LLMs, temperature, classifier, and number of synthetic data cases is via augmentation with LLaMA 7B at temperature 0.7 with 100 augments, using Robustly Optimized BERT Pretraining Approach (RoBERTa) for the classification task, achieving an average area under the receiver operating characteristic (AUC) curve of 0.87 (SD 0.02; ie, 1 SD). The results demonstrate that open-source LLMs can enhance text classifiers' performance for small datasets in health care contexts, providing promising pathways for improving medical education processes and patient care practices.

Conclusions: The study demonstrates the value of data augmentation with open-source LLMs, highlights the importance of privacy and ethical considerations when using LLMs, and suggests future directions for research in this field.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

JMIR Medical Education Social Sciences-Education

CiteScore

6.90

自引率

5.60%

发文量

审稿时长

8 weeks