{"title":"医学与大语言模型研究的必然转变:可能性与陷阱","authors":"Yuanxu Gao, Daniel T. Baptista-Hon, Kang Zhang","doi":"10.1002/mef2.49","DOIUrl":null,"url":null,"abstract":"<p>Large language models (LLMs) often refer to artificial intelligence models that consist of extensive parameters and have the ability to understand and generate human-like language. They are typically developed in a self-supervised learning manner and are trained on large quantities of unlabeled text to learn patterns in language. LLMs were initially used in natural language processing (NLP), but they have since been extended to a variety of tasks like processing biological sequences and combining text with other modalities of data. LLMs have the potential to revolutionize the way we approach scientific research and medicine. For example, by leveraging their ability to understand and interpret vast quantities of text data, LLMs can provide insights and make predictions that would otherwise be impossible.</p><p>In the medical domain, LLMs can be used to analyze immense electronic health records and improve communication between healthcare professionals and patients. For example, LLMs can be used to automate triage, medical coding, and clinical documentation, which can help to improve the accuracy and efficiency of these processes. They can also be used to improve NLP in medical chatbots and virtual assistants, allowing patients to interact with healthcare services more efficiently and effectively. They can also be used to process medical records and patient data, enabling better diagnoses and more personalized treatments. They can also be used to analyze clinical trial data and identify trends that could lead to better outcomes. Finally, LLMs can also be used to answer medical questions and provide guidance to healthcare professionals, which can help to improve the quality of care. In the accompanying Review, Zheng et al.<span><sup>1</sup></span> undertake a major effort to write a comprehensive review of this exciting and highly evolving field.</p><p>In research, LLMs can be used to search through diverse large datasets and identify patterns that would otherwise be difficult to detect. They can also be used to generate and test hypotheses and to summarize and analyze research papers. It is clear that LLMs will be transforming the way we communicate about medicine and research, and have the potential to revolutionize the field of healthcare.</p><p>The current state-of-the-art LLM is Generative Pre-trained Transformer 4 (GPT-4), developed by OpenAI, about which Technical details have not been made public yet.<span><sup>2</sup></span> Based on publicly available information, the number of parameters is comparable to its previous generation, GPT-3, which consists of 175 billion parameters. GPT-4 is a generative model, meaning it can generate human-like language and even create original content. Other notable LLMs include GPT-3, Bidirectional Encoder Representations from Transformers, and Text-to-Text Transfer Transformers, each with its unique strengths and capabilities. However, one example of an LLM developed specifically for the medical domain is GatorTron,<span><sup>3</sup></span> which can process and interpret electronic health records. GatorTron was developed by a team of researchers from the University of Florida. The model is trained on >90 billion words of text, including >82 billion words of deidentified clinical text. 
GatorTron achieves good performance on five clinical NLP tasks, including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference, and medical question answering. Besides, the results show that scaling up the number of parameters and the size of the training data can significantly improve the performance of these clinical NLP tasks. GatorTron's ability to accurately process unstructured clinical text can enhance medical AI systems and improve healthcare delivery. GatorTron is an example of the potential of LLMs to be tailored to specific domains or industries, allowing for more accurate and efficient language processing in specialized fields.</p><p>Despite the many potential benefits of LLMs in medicine and research, there are also risks and concerns. LLMs could be exploited for spreading false information or manipulating public opinion, such as during global health crises. LLMs are also fundamentally trained on all available information or data, including inaccuracies and biases. These inaccuracies and biases can be reflected in the output of hallucination, which refers to mistakes in the generated text that are semantically or syntactically plausible but are in fact incorrect or nonsensical. There are also privacy concerns with LLMs because they can potentially access and process sensitive personal data.</p><p>It is ultimately difficult to hold LLMs accountable for their outputs. Therefore, the accountability ultimately rests on the user. Human oversight and governance of LLM outputs, especially in medicine and research, is paramount. The implementation of LLMs in healthcare has to be subjected to the same rigor and standards as any other new interventions through clinical trials, to demonstrate that the application of LLMs is at least noninferior to current approaches. Ultimately, the use of LLMs in medicine and research requires a shared responsibility among all stakeholders, including researchers, technology companies, regulatory bodies, and society as a whole.</p><p>The power and potential of LLMs mean that it is here to stay, and its widespread implementation is inevitable. Recognition of its potential and ethical implementation is essential to ensure that they are used responsibly and for the benefit of all.</p><p>Yuanxu Gao, Daniel T. Baptista-Hon, and Kang Zhang wrote the manuscript. All authors have read and approved the final manuscript.</p><p>The authors declare no conflict of interest.</p>","PeriodicalId":74135,"journal":{"name":"MedComm - Future medicine","volume":"2 2","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/mef2.49","citationCount":"1","resultStr":"{\"title\":\"The inevitable transformation of medicine and research by large language models: The possibilities and pitfalls\",\"authors\":\"Yuanxu Gao, Daniel T. Baptista-Hon, Kang Zhang\",\"doi\":\"10.1002/mef2.49\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Large language models (LLMs) often refer to artificial intelligence models that consist of extensive parameters and have the ability to understand and generate human-like language. They are typically developed in a self-supervised learning manner and are trained on large quantities of unlabeled text to learn patterns in language. 
LLMs were initially used in natural language processing (NLP), but they have since been extended to a variety of tasks like processing biological sequences and combining text with other modalities of data. LLMs have the potential to revolutionize the way we approach scientific research and medicine. For example, by leveraging their ability to understand and interpret vast quantities of text data, LLMs can provide insights and make predictions that would otherwise be impossible.</p><p>In the medical domain, LLMs can be used to analyze immense electronic health records and improve communication between healthcare professionals and patients. For example, LLMs can be used to automate triage, medical coding, and clinical documentation, which can help to improve the accuracy and efficiency of these processes. They can also be used to improve NLP in medical chatbots and virtual assistants, allowing patients to interact with healthcare services more efficiently and effectively. They can also be used to process medical records and patient data, enabling better diagnoses and more personalized treatments. They can also be used to analyze clinical trial data and identify trends that could lead to better outcomes. Finally, LLMs can also be used to answer medical questions and provide guidance to healthcare professionals, which can help to improve the quality of care. In the accompanying Review, Zheng et al.<span><sup>1</sup></span> undertake a major effort to write a comprehensive review of this exciting and highly evolving field.</p><p>In research, LLMs can be used to search through diverse large datasets and identify patterns that would otherwise be difficult to detect. They can also be used to generate and test hypotheses and to summarize and analyze research papers. It is clear that LLMs will be transforming the way we communicate about medicine and research, and have the potential to revolutionize the field of healthcare.</p><p>The current state-of-the-art LLM is Generative Pre-trained Transformer 4 (GPT-4), developed by OpenAI, about which Technical details have not been made public yet.<span><sup>2</sup></span> Based on publicly available information, the number of parameters is comparable to its previous generation, GPT-3, which consists of 175 billion parameters. GPT-4 is a generative model, meaning it can generate human-like language and even create original content. Other notable LLMs include GPT-3, Bidirectional Encoder Representations from Transformers, and Text-to-Text Transfer Transformers, each with its unique strengths and capabilities. However, one example of an LLM developed specifically for the medical domain is GatorTron,<span><sup>3</sup></span> which can process and interpret electronic health records. GatorTron was developed by a team of researchers from the University of Florida. The model is trained on >90 billion words of text, including >82 billion words of deidentified clinical text. GatorTron achieves good performance on five clinical NLP tasks, including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference, and medical question answering. Besides, the results show that scaling up the number of parameters and the size of the training data can significantly improve the performance of these clinical NLP tasks. GatorTron's ability to accurately process unstructured clinical text can enhance medical AI systems and improve healthcare delivery. 
GatorTron is an example of the potential of LLMs to be tailored to specific domains or industries, allowing for more accurate and efficient language processing in specialized fields.</p><p>Despite the many potential benefits of LLMs in medicine and research, there are also risks and concerns. LLMs could be exploited for spreading false information or manipulating public opinion, such as during global health crises. LLMs are also fundamentally trained on all available information or data, including inaccuracies and biases. These inaccuracies and biases can be reflected in the output of hallucination, which refers to mistakes in the generated text that are semantically or syntactically plausible but are in fact incorrect or nonsensical. There are also privacy concerns with LLMs because they can potentially access and process sensitive personal data.</p><p>It is ultimately difficult to hold LLMs accountable for their outputs. Therefore, the accountability ultimately rests on the user. Human oversight and governance of LLM outputs, especially in medicine and research, is paramount. The implementation of LLMs in healthcare has to be subjected to the same rigor and standards as any other new interventions through clinical trials, to demonstrate that the application of LLMs is at least noninferior to current approaches. Ultimately, the use of LLMs in medicine and research requires a shared responsibility among all stakeholders, including researchers, technology companies, regulatory bodies, and society as a whole.</p><p>The power and potential of LLMs mean that it is here to stay, and its widespread implementation is inevitable. Recognition of its potential and ethical implementation is essential to ensure that they are used responsibly and for the benefit of all.</p><p>Yuanxu Gao, Daniel T. Baptista-Hon, and Kang Zhang wrote the manuscript. All authors have read and approved the final manuscript.</p><p>The authors declare no conflict of interest.</p>\",\"PeriodicalId\":74135,\"journal\":{\"name\":\"MedComm - Future medicine\",\"volume\":\"2 2\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/mef2.49\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"MedComm - Future medicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/mef2.49\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"MedComm - Future medicine","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/mef2.49","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
摘要
大型语言模型(llm)通常是指由大量参数组成的人工智能模型,具有理解和生成类人语言的能力。它们通常以自我监督的学习方式发展,并在大量未标记的文本上进行训练,以学习语言模式。llm最初用于自然语言处理(NLP),但它们已经扩展到各种任务,如处理生物序列和将文本与其他形式的数据相结合。法学硕士有可能彻底改变我们从事科学研究和医学的方式。例如,通过利用他们理解和解释大量文本数据的能力,法学硕士可以提供洞察力并做出预测,否则这是不可能的。在医学领域,法学硕士可以用来分析大量的电子健康记录,并改善医疗保健专业人员和患者之间的沟通。例如,llm可用于自动分类、医疗编码和临床文档,这有助于提高这些过程的准确性和效率。它们还可以用于改进医疗聊天机器人和虚拟助手中的NLP,使患者能够更高效地与医疗服务进行互动。它们还可以用于处理医疗记录和患者数据,从而实现更好的诊断和更个性化的治疗。它们还可以用于分析临床试验数据,并确定可能导致更好结果的趋势。最后,法学硕士还可以用来回答医学问题,并为医疗保健专业人员提供指导,这有助于提高护理质量。在随附的综述中,郑等人1承担了主要的工作,对这一令人兴奋和高度发展的领域进行了全面的综述。在研究中,法学硕士可用于搜索不同的大型数据集,并识别难以检测的模式。它们也可以用来产生和检验假设,总结和分析研究论文。很明显,法学硕士将改变我们关于医学和研究的交流方式,并有可能彻底改变医疗保健领域。目前最先进的LLM是由OpenAI开发的生成预训练变压器4 (GPT-4),有关其技术细节尚未公开根据公开信息,参数的数量与上一代GPT-3相当,后者由1750亿个参数组成。GPT-4是一个生成模型,这意味着它可以生成类似人类的语言,甚至可以创建原创内容。其他著名的llm包括GPT-3,双向编码器表示从变压器,和文本到文本传输变压器,每一个都有其独特的优势和能力。然而,专门为医疗领域开发的法学硕士的一个例子是GatorTron,它可以处理和解释电子健康记录。GatorTron是由佛罗里达大学的一组研究人员开发的。该模型在900亿字的文本上进行训练,其中包括820亿字的未识别临床文本。GatorTron在临床概念提取、医学关系提取、语义文本相似度、自然语言推理、医学问答等5个临床NLP任务上均取得了较好的表现。此外,结果表明,扩大参数数量和训练数据的大小可以显著提高这些临床NLP任务的性能。GatorTron准确处理非结构化临床文本的能力可以增强医疗人工智能系统并改善医疗服务。GatorTron是llm为特定领域或行业量身定制的潜力的一个例子,允许在专业领域进行更准确和有效的语言处理。尽管法学硕士在医学和研究方面有许多潜在的好处,但也存在风险和担忧。法学硕士可能被利用来传播虚假信息或操纵公众舆论,例如在全球卫生危机期间。法学硕士也从根本上接受了所有可用信息或数据的培训,包括不准确和偏差。这些不准确和偏差可以反映在幻觉的输出中,幻觉指的是生成文本中的错误,这些错误在语义或语法上是合理的,但实际上是不正确或荒谬的。法学硕士也存在隐私问题,因为他们可能会访问和处理敏感的个人数据。最终很难让法学硕士对他们的产出负责。因此,责任最终取决于用户。人类对法学硕士产出的监督和治理,特别是在医学和研究方面,是至关重要的。 通过临床试验,在医疗保健领域实施法学硕士必须遵守与任何其他新干预措施相同的严格性和标准,以证明法学硕士的应用至少不逊于目前的方法。最终,在医学和研究中使用法学硕士需要所有利益相关者共同承担责任,包括研究人员、技术公司、监管机构和整个社会。法学硕士的力量和潜力意味着它将继续存在,它的广泛实施是不可避免的。对其潜力的认识和合乎道德的实施对于确保负责任地使用它们并造福所有人至关重要。高元旭、Daniel T. Baptista-Hon和张康撰写了手稿。所有作者都阅读并批准了最终稿件。作者声明无利益冲突。
The inevitable transformation of medicine and research by large language models: The possibilities and pitfalls

Yuanxu Gao, Daniel T. Baptista-Hon, Kang Zhang

MedComm - Future medicine, 2(2), 2023 (published May 29, 2023). DOI: 10.1002/mef2.49
Large language models (LLMs) are artificial intelligence models with very large numbers of parameters and the ability to understand and generate human-like language. They are typically trained in a self-supervised manner on large quantities of unlabeled text, from which they learn the statistical patterns of language. LLMs were initially used in natural language processing (NLP), but they have since been extended to a variety of tasks, such as processing biological sequences and combining text with other modalities of data. LLMs have the potential to revolutionize the way we approach scientific research and medicine. For example, by leveraging their ability to understand and interpret vast quantities of text data, LLMs can provide insights and make predictions that would otherwise be impossible.
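To make the self-supervised recipe concrete, the following is a minimal, illustrative sketch (ours, not from the article) of next-token prediction on unlabeled text. It uses a toy character-level recurrent model in place of a full transformer; the key point is that the training targets are simply the text shifted by one token, so no human labels are needed:

```python
import torch
import torch.nn as nn

# Toy "unlabeled corpus"; real LLMs train on billions of subword tokens.
text = "patients with fever and cough were triaged for testing"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = torch.tensor([stoi[ch] for ch in text])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)  # stand-in for a transformer
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

# Self-supervision: inputs are tokens 0..n-2, targets are tokens 1..n-1,
# so the "labels" come from the raw text itself.
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for step in range(200):
    logits = model(x)  # shape: (1, seq_len, vocab_size)
    loss = nn.functional.cross_entropy(logits.squeeze(0), y.squeeze(0))
    opt.zero_grad(); loss.backward(); opt.step()
print(f"final next-token loss: {loss.item():.3f}")
```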
In the medical domain, LLMs can be used to analyze immense volumes of electronic health records and to improve communication between healthcare professionals and patients. For example, they can automate triage, medical coding, and clinical documentation, improving the accuracy and efficiency of these processes. They can power NLP in medical chatbots and virtual assistants, allowing patients to interact with healthcare services more efficiently and effectively. They can process medical records and patient data to enable better diagnoses and more personalized treatments, and they can analyze clinical trial data to identify trends that could lead to better outcomes. Finally, LLMs can answer medical questions and provide guidance to healthcare professionals, helping to improve the quality of care. In the accompanying Review, Zheng et al.1 undertake a major effort to provide a comprehensive overview of this exciting and rapidly evolving field.
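As an illustration of the medical-coding use case above, here is a hypothetical sketch that wraps an LLM call for ICD-10 code suggestion. The `query_llm` callable is a placeholder of our own, not a real vendor API, and any suggested code would still require clinician review before use:

```python
from typing import Callable, List

def suggest_icd10_codes(clinical_note: str,
                        query_llm: Callable[[str], str]) -> List[str]:
    """Ask an LLM for candidate ICD-10 codes; output needs clinician review."""
    prompt = (
        "You are a medical coding assistant. Read the clinical note below and "
        "list candidate ICD-10 codes, one per line, formatted as CODE: rationale.\n\n"
        f"Note:\n{clinical_note}"
    )
    response = query_llm(prompt)  # placeholder for any text-in/text-out model API
    codes = []
    for line in response.splitlines():
        head = line.split(":", 1)[0].strip()
        # Keep only strings that look like ICD-10 codes (e.g., "J18.9").
        if 3 <= len(head) <= 8 and head[:1].isalpha() and any(c.isdigit() for c in head):
            codes.append(head)
    return codes
```

Keeping the model's free-text rationale out of the returned list, and the human review step outside the function, reflects the accountability point discussed later: the LLM proposes, and the clinician disposes.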
In research, LLMs can be used to search through diverse large datasets and identify patterns that would otherwise be difficult to detect. They can also be used to generate and test hypotheses, and to summarize and analyze research papers. It is clear that LLMs will transform the way we communicate about medicine and research, and they have the potential to revolutionize the field of healthcare.
The current state-of-the-art LLM is Generative Pre-trained Transformer 4 (GPT-4), developed by OpenAI; its technical details have not yet been made public.2 Based on publicly available information, its number of parameters is thought to be comparable to that of its predecessor, GPT-3, which consists of 175 billion parameters. GPT-4 is a generative model, meaning it can generate human-like language and even create original content. Other notable LLMs include GPT-3, Bidirectional Encoder Representations from Transformers (BERT), and the Text-to-Text Transfer Transformer (T5), each with unique strengths and capabilities. One example of an LLM developed specifically for the medical domain is GatorTron,3 which can process and interpret electronic health records. GatorTron was developed by a team of researchers from the University of Florida. The model is trained on >90 billion words of text, including >82 billion words of deidentified clinical text. GatorTron achieves good performance on five clinical NLP tasks: clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference, and medical question answering. Moreover, the results show that scaling up the number of parameters and the size of the training data significantly improves performance on these clinical NLP tasks. GatorTron's ability to accurately process unstructured clinical text can enhance medical AI systems and improve healthcare delivery. GatorTron illustrates the potential of LLMs to be tailored to specific domains or industries, allowing for more accurate and efficient language processing in specialized fields.
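To show how a domain-specific encoder such as GatorTron might be used in practice, here is a hedged sketch using the Hugging Face transformers library. The checkpoint name is our assumption about where the public weights are hosted, and a task such as clinical concept extraction would additionally require a fine-tuned token-classification head on top of the encoder:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed hub location of the public GatorTron weights; adjust if needed.
checkpoint = "UFNLP/gatortron-base"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

note = "Patient presents with chest pain radiating to the left arm."
inputs = tokenizer(note, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per subword token; a fine-tuned classification
# head over these vectors would perform, e.g., clinical concept extraction.
token_embeddings = outputs.last_hidden_state  # (1, seq_len, hidden_size)
print(token_embeddings.shape)
```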
Despite the many potential benefits of LLMs in medicine and research, there are also risks and concerns. LLMs could be exploited to spread false information or manipulate public opinion, for example during global health crises. LLMs are also trained on all available information or data, including inaccuracies and biases. These inaccuracies and biases can surface in model outputs as hallucinations: generated text that is semantically or syntactically plausible but in fact incorrect or nonsensical. There are also privacy concerns, because LLMs can potentially access and process sensitive personal data.
It is ultimately difficult to hold LLMs accountable for their outputs; the accountability therefore rests on the user. Human oversight and governance of LLM outputs, especially in medicine and research, are paramount. The implementation of LLMs in healthcare must be subjected to the same rigor and standards as any other new intervention, through clinical trials demonstrating that the application of LLMs is at least noninferior to current approaches. Ultimately, the use of LLMs in medicine and research requires a shared responsibility among all stakeholders, including researchers, technology companies, regulatory bodies, and society as a whole.
The power and potential of LLMs mean that they are here to stay, and their widespread implementation is inevitable. Recognizing their potential and implementing them ethically are essential to ensure that they are used responsibly and for the benefit of all.
Yuanxu Gao, Daniel T. Baptista-Hon, and Kang Zhang wrote the manuscript. All authors have read and approved the final manuscript.

The authors declare no conflict of interest.