Hongyu Chen, Xiaohan Li, Xing He, Aokun Chen, James McGill, Emily C Webber, Hua Xu, Mei Liu, Jiang Bian
JCO Clinical Cancer Informatics, vol. 9, e2500071. Journal Article. Published 2025-06-01 (Epub 2025-06-09). DOI: 10.1200/CCI-25-00071 (https://doi.org/10.1200/CCI-25-00071). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12169815/pdf/
Enhancing Patient-Trial Matching With Large Language Models: A Scoping Review of Emerging Applications and Approaches.
Purpose: Patient recruitment remains a major bottleneck in clinical trial execution, with inefficient patient-trial matching often causing delays and failures. Recent advancements in large language models (LLMs) offer a promising avenue for automating and improving this process. This scoping review aims to provide a comprehensive synthesis of the emerging applications of LLMs in patient-trial matching.
Methods: A comprehensive search was conducted in PubMed, Web of Science, and OpenAlex for literature published between December 1, 2022, and December 31, 2024. Studies were included if they explicitly integrated LLMs into patient-trial matching systems. Data extraction focused on system architectures, patient data processing, eligibility criteria processing, matching techniques, evaluation metrics, and performance.
Results: Of the 2,357 studies initially identified, 24 met the inclusion criteria. The majority (21/24) were published in 2024, highlighting the rapid adoption of LLMs in this domain. Most systems used patient-centric matching (17/24), with OpenAI's generative pretrained transformer models being the most commonly used LLM. Core components of these systems included eligibility criteria processing, patient data processing, and matching, with some incorporating retrieval algorithms to enhance computational efficiency. LLM-integrated approaches demonstrated improved accuracy and scalability in patient-trial matching, although challenges such as performance variability, interpretability, and reliance on synthetic data sets remain significant.
Conclusion: LLM-based patient-trial matching systems present a transformative opportunity to enhance the efficiency and accuracy of clinical trial recruitment. Despite current limitations related to model generalizability, explainability, and data constraints, future advancements in hybrid modeling strategies, domain-specific fine-tuning, and real-world data set integration could further optimize LLM-based trial matching. Addressing these challenges will be crucial to realizing the full potential of LLMs in streamlining patient recruitment and accelerating clinical trial execution.
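The core components named in the Results (eligibility criteria processing, patient data processing, an optional retrieval step, and matching) can be illustrated with a minimal patient-centric sketch. Everything in it is an assumption made for illustration and is not taken from any of the reviewed systems: the `Trial` and `Patient` fields, the function names, and in particular `judge`, which is a rule-based stand-in for the eligibility judgment that a real system would delegate to an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    # Hypothetical structured form of a trial's eligibility criteria.
    trial_id: str
    condition: str
    min_age: int
    max_age: int
    excluded_terms: set[str] = field(default_factory=set)


@dataclass
class Patient:
    # Hypothetical structured form of a processed patient record.
    age: int
    conditions: set[str]


def retrieve(patient: Patient, trials: list[Trial]) -> list[Trial]:
    """Cheap first pass: keep only trials whose target condition the patient has."""
    return [t for t in trials if t.condition in patient.conditions]


def judge(patient: Patient, trial: Trial) -> bool:
    """Stand-in for the LLM eligibility judgment (rule-based here, by assumption)."""
    in_age_range = trial.min_age <= patient.age <= trial.max_age
    has_exclusion = bool(trial.excluded_terms & patient.conditions)
    return in_age_range and not has_exclusion


def match_trials(patient: Patient, trials: list[Trial]) -> list[str]:
    """Patient-centric matching: one patient in, eligible trial IDs out."""
    return [t.trial_id for t in retrieve(patient, trials) if judge(patient, t)]


trials = [
    Trial("T1", "breast cancer", 18, 75),
    Trial("T2", "breast cancer", 18, 75, {"heart failure"}),
    Trial("T3", "lung cancer", 40, 80),
]
patient = Patient(age=62, conditions={"breast cancer", "heart failure"})
print(match_trials(patient, trials))  # prints ['T1']
```

The retrieval pass discards T3 before the more expensive judgment runs, which is the computational-efficiency role the abstract attributes to retrieval algorithms. T2 survives retrieval but is rejected by the exclusion check.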