From RAGs to riches: Utilizing large language models to write documents for clinical trials

IF 2.2 | Zone 3, Medicine | Q3 MEDICINE, RESEARCH & EXPERIMENTAL
Nigel Markey, Ilyass El-Mansouri, Gaetan Rensonnet, Casper van Langen, Christoph Meier
Journal: Clinical Trials
Published: 2025-02-27
Publication type: Journal Article
DOI: 10.1177/17407745251320806 (https://doi.org/10.1177/17407745251320806)
Citations: 0

Abstract

From RAGs to riches: Utilizing large language models to write documents for clinical trials.

Background/aims: Clinical trials require numerous documents to be written: Protocols, consent forms, clinical study reports, and many others. Large language models offer the potential to rapidly generate first-draft versions of these documents; however, there are concerns about the quality of their output. Here, we report an evaluation of how good large language models are at generating sections of one such document, clinical trial protocols.

Methods: Using an off-the-shelf large language model, we generated protocol sections for a broad range of diseases and clinical trial phases. We assessed each of these document sections across four dimensions: clinical thinking and logic; transparency and references; medical and clinical terminology; and content relevance and suitability. To improve performance, we used retrieval-augmented generation to supply the large language model with accurate, up-to-date information, including regulatory guidance documents and data from ClinicalTrials.gov. Using this retrieval-augmented large language model, we regenerated the same protocol sections and assessed them across the same four dimensions.
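
The abstract does not describe the retrieval pipeline itself, so the following is only a minimal sketch of the general retrieval-augmented generation pattern, not the authors' implementation. It assumes a toy bag-of-words cosine retriever and a three-entry in-memory corpus; the corpus text, the document ids, and the function names `retrieve` and `build_prompt` are all hypothetical and chosen for illustration. The idea is to rank reference passages against the writing task and place the best matches in the prompt, so that the model can ground and cite its draft.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for regulatory guidance and registry records.
# These strings are invented for illustration, not real source text.
CORPUS = {
    "guidance-endpoints": "Primary endpoints should be clinically meaningful "
                          "and defined before enrollment begins.",
    "registry-phase2-asthma": "Phase 2 asthma trial measuring change in FEV1 "
                              "at week 12 as the primary endpoint.",
    "guidance-consent": "Informed consent forms must describe risks, benefits, "
                        "and alternatives in plain language.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k document ids, best match first, skipping zero scores."""
    query_vec = Counter(tokenize(query))
    scored = sorted(
        ((cosine(query_vec, Counter(tokenize(text))), doc_id)
         for doc_id, text in corpus.items()),
        reverse=True,
    )
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(task, corpus, k=2):
    """Assemble a prompt that places the retrieved passages before the task."""
    doc_ids = retrieve(task, corpus, k)
    sources = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in doc_ids)
    return (
        "Use only the sources below and cite them by id.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Task: {task}"
    )


if __name__ == "__main__":
    print(build_prompt("primary endpoint for a phase 2 asthma trial", CORPUS))
```

The resulting prompt string would then be passed to whichever large language model is in use. A production system would more plausibly use dense embeddings and a vector index than word counts, but the control flow of retrieving first and generating second stays the same.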

Results: The off-the-shelf large language model delivers reasonable results, especially for content relevance and the correct use of medical and clinical terminology, with scores of over 80%. However, it shows limited performance on two dimensions, "clinical thinking and logic" and "transparency and references", with assessment scores of ≈40% or less. Retrieval-augmented generation substantially improves the writing quality of the large language model, raising the scores for both of these dimensions to ≈80%. The retrieval-augmented generation method thus greatly improves the practical usability of large language models for clinical trial-related writing.

Discussion: Our results suggest that hybrid large language model architectures, such as the retrieval-augmented generation method we utilized, offer strong potential for clinical trial-related writing, including a wide variety of documents. This is potentially transformative, since it addresses several major bottlenecks of drug development.

Source journal
Clinical Trials (Medicine: Research & Experimental)
CiteScore: 4.10
Self-citation rate: 3.70%
Articles published: 82
Review time: 6-12 weeks
About the journal: Clinical Trials is dedicated to advancing knowledge on the design and conduct of clinical trials and related research methodologies. Covering the design, conduct, analysis, synthesis and evaluation of key methodologies, the journal remains on the cusp of the latest topics, including ethics, regulation and policy impact.