Analysis of eligibility criteria clusters based on large language models for clinical trial design.

IF 4.7 2区 医学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Alban Bornet, Philipp Khlebnikov, Florian Meer, Quentin Haas, Anthony Yazdani, Boya Zhang, Poorya Amini, Douglas Teodoro
{"title":"Analysis of eligibility criteria clusters based on large language models for clinical trial design.","authors":"Alban Bornet, Philipp Khlebnikov, Florian Meer, Quentin Haas, Anthony Yazdani, Boya Zhang, Poorya Amini, Douglas Teodoro","doi":"10.1093/jamia/ocae311","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>Clinical trials (CTs) are essential for improving patient care by evaluating new treatments' safety and efficacy. A key component in CT protocols is the study population defined by the eligibility criteria. This study aims to evaluate the effectiveness of large language models (LLMs) in encoding eligibility criterion information to support CT-protocol design.</p><p><strong>Materials and methods: </strong>We extracted eligibility criterion sections, phases, conditions, and interventions from CT protocols available in the ClinicalTrials.gov registry. Eligibility sections were split into individual rules using a criterion tokenizer and embedded using LLMs. The obtained representations were clustered. The quality and relevance of the clusters for protocol design was evaluated through 3 experiments: intrinsic alignment with protocol information and human expert cluster coherence assessment, extrinsic evaluation through CT-level classification tasks, and eligibility section generation.</p><p><strong>Results: </strong>Sentence embeddings fine-tuned using biomedical corpora produce clusters with the highest alignment to CT-level information. Human expert evaluation confirms that clusters are well structured and coherent. Despite the high information compression, clusters retain significant CT information, up to 97% of the classification performance obtained with raw embeddings. Finally, eligibility sections automatically generated using clusters achieve 95% of the ROUGE scores obtained with a generative LLM prompted with CT-protocol details, suggesting that clusters encapsulate information useful to CT-protocol design.</p><p><strong>Discussion: </strong>Clusters derived from sentence-level LLM embeddings effectively summarize complex eligibility criterion data while retaining relevant CT-protocol details. Clustering-based approaches provide a scalable enhancement in CT design that balances information compression with accuracy.</p><p><strong>Conclusions: </strong>Clustering eligibility criteria using LLM embeddings provides a practical and efficient method to summarize critical protocol information. We provide an interactive visualization of the pipeline here.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7000,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Medical Informatics Association","FirstCategoryId":"91","ListUrlMain":"https://doi.org/10.1093/jamia/ocae311","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Objectives: Clinical trials (CTs) are essential for improving patient care by evaluating new treatments' safety and efficacy. A key component in CT protocols is the study population defined by the eligibility criteria. This study aims to evaluate the effectiveness of large language models (LLMs) in encoding eligibility criterion information to support CT-protocol design.

Materials and methods: We extracted eligibility criterion sections, phases, conditions, and interventions from CT protocols available in the ClinicalTrials.gov registry. Eligibility sections were split into individual rules using a criterion tokenizer and embedded using LLMs. The obtained representations were clustered. The quality and relevance of the clusters for protocol design was evaluated through 3 experiments: intrinsic alignment with protocol information and human expert cluster coherence assessment, extrinsic evaluation through CT-level classification tasks, and eligibility section generation.

Results: Sentence embeddings fine-tuned using biomedical corpora produce clusters with the highest alignment to CT-level information. Human expert evaluation confirms that clusters are well structured and coherent. Despite the high information compression, clusters retain significant CT information, up to 97% of the classification performance obtained with raw embeddings. Finally, eligibility sections automatically generated using clusters achieve 95% of the ROUGE scores obtained with a generative LLM prompted with CT-protocol details, suggesting that clusters encapsulate information useful to CT-protocol design.

Discussion: Clusters derived from sentence-level LLM embeddings effectively summarize complex eligibility criterion data while retaining relevant CT-protocol details. Clustering-based approaches provide a scalable enhancement in CT design that balances information compression with accuracy.

Conclusions: Clustering eligibility criteria using LLM embeddings provides a practical and efficient method to summarize critical protocol information. We provide an interactive visualization of the pipeline here.

基于大型语言模型的临床试验设计合格标准聚类分析。
目的:临床试验(ct)是通过评估新疗法的安全性和有效性来改善患者护理的必要条件。CT方案的一个关键组成部分是由资格标准定义的研究人群。本研究旨在评估大型语言模型(LLMs)在编码合格标准信息以支持ct协议设计方面的有效性。材料和方法:我们从ClinicalTrials.gov注册中心的CT方案中提取合格标准部分、阶段、条件和干预措施。资格部分使用标准标记器划分为单独的规则,并使用llm嵌入。对得到的表示进行聚类。通过3个实验来评估方案设计聚类的质量和相关性:与方案信息和人类专家聚类一致性评估的内在一致性,通过ct级分类任务进行的外在评估,以及资格截面生成。结果:使用生物医学语料库对句子嵌入进行微调,产生与ct级信息最高对齐的聚类。人类专家的评估证实,集群结构良好,连贯。尽管信息压缩程度很高,但聚类仍然保留了大量的CT信息,其分类性能达到原始嵌入的97%。最后,使用集群自动生成的合格性部分达到了95%的ROUGE分数,而ROUGE分数是由提示ct协议细节的生成式LLM获得的,这表明集群封装了对ct协议设计有用的信息。讨论:来自句子级LLM嵌入的聚类有效地总结了复杂的资格标准数据,同时保留了相关的ct协议细节。基于聚类的方法为CT设计提供了可扩展的增强,平衡了信息压缩和准确性。结论:使用LLM嵌入的聚类资格标准为总结关键协议信息提供了实用而有效的方法。我们在这里提供了管道的交互式可视化。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of the American Medical Informatics Association
Journal of the American Medical Informatics Association 医学-计算机:跨学科应用
CiteScore
14.50
自引率
7.80%
发文量
230
审稿时长
3-8 weeks
期刊介绍: JAMIA is AMIA''s premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA''s articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信