Using Large Language Models for Advanced and Flexible Labelling of Protocol Deviations in Clinical Development.

IF 1.9 | Zone 4 (Medicine) | Q4 MEDICAL INFORMATICS

Therapeutic innovation & regulatory science Pub Date : 2025-07-01 Epub Date: 2025-05-13 DOI:10.1007/s43441-025-00785-z

Min Zou, Leszek Popko, Michelle Gaudio

Pages: 833-847. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12181094/pdf/

Abstract

Background: ICH E3 Q&A R1 defines a protocol deviation (PD) as "any change, divergence, or departure from the study design or procedures defined in the protocol" (International Council for Harmonisation. E3: Structure and content of clinical study reports-questions and answers (R1). 6 July 2012. Available from: https://database.ich.org/sites/default/files/E3_Q%26As_R1_Q%26As.pdf ). A persistent problem in human subject protection is that institutions, sponsors, investigators and IRBs diverge widely in how they define PDs and in the procedures they use to review them. Despite industry initiatives such as TransCelerate's holistic approach [Galuchie et al. in Ther Innov Regul Sci 55:733-742, 2021], systematic trending and identification of impactful PDs remain limited. Traditional Natural Language Processing (NLP) methods are often cumbersome to implement, requiring extensive feature engineering and model tuning. The rise of Large Language Models (LLMs), however, has revolutionised text classification, enabling more accurate, nuanced, and context-aware solutions [Nguyen P. Text classification in the age of LLMs. 2024. Available from: https://blog.redsift.com/author/phong/ ]. An automated solution that enables efficient, flexible, and targeted PD classification is currently lacking.

Methods: We developed a novel approach that uses a large language model (LLM), Meta Llama2 [Meta. Llama 2: Open source, free for research and commercial use. 2023. Available from: https://www.llama.com/llama2/ ], with a tailored prompt to classify free-text PDs from Roche's PD management system. The model outputs were analysed to identify trends and assess risks across clinical programs, supporting human decision-making. This method offers a generalisable framework for developing prompts and integrating data to address similar challenges in clinical development.
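The prompt-based classification step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the prompt nor the label set, so `LABELS`, `PROMPT_TEMPLATE`, the `NEEDS_REVIEW` fallback, and the `generate` callable (a stand-in for whatever Llama 2 inference interface is used) are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical label set; the actual categories are not given in the abstract.
LABELS = ("AFFECTS_PROGRESSION_ASSESSMENT", "OTHER")

# Hypothetical prompt wording, not the tailored prompt used in the study.
PROMPT_TEMPLATE = (
    "You are reviewing clinical trial protocol deviations (PDs).\n"
    "Decide whether the PD below could affect disease progression assessment.\n"
    "Answer with exactly one label: {labels}.\n\n"
    "PD description: {text}\n"
    "Label:"
)


def build_prompt(pd_text: str) -> str:
    """Insert one free-text PD description into the prompt template."""
    return PROMPT_TEMPLATE.format(labels=" or ".join(LABELS), text=pd_text.strip())


def parse_label(raw_output: str) -> str:
    """Map raw model text onto a known label, else route to human review."""
    cleaned = raw_output.strip().upper()
    for label in LABELS:
        if cleaned.startswith(label):
            return label
    return "NEEDS_REVIEW"


def classify_pd(pd_text: str, generate: Callable[[str], str]) -> str:
    """Classify one PD; `generate` is any function mapping a prompt to text."""
    return parse_label(generate(build_prompt(pd_text)))


def fake_generate(prompt: str) -> str:
    """Keyword stub standing in for a real LLM call, for demonstration only."""
    pd_part = prompt.split("PD description:", 1)[1].lower()
    return "AFFECTS_PROGRESSION_ASSESSMENT" if "scan" in pd_part else " Other\n"


if __name__ == "__main__":
    for text in (
        "Tumor assessment scan performed 3 weeks outside the visit window",
        "Informed consent form signed late",
    ):
        print(text, "->", classify_pd(text, fake_generate))
```

Passing the model in as a callable keeps the prompt construction and output parsing testable without a model, and the `NEEDS_REVIEW` fallback reflects the abstract's emphasis on supporting, rather than replacing, human decision-making.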

Results: This approach flagged over 80% of PDs potentially affecting disease progression assessment, enabling expert review. Whereas manual analysis takes months, this automated method produced actionable insights in minutes. The solution also highlighted gaps in first-line controls, supporting process improvement and better accuracy in disease progression handling during trials.
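One plausible reading of the "over 80%" figure is a recall-style rate: the share of PDs judged relevant by experts that the model also flagged. The abstract does not state how the figure was computed, so the sketch below is an assumption, and the function names and example data are hypothetical. It also shows a simple per-program count of flagged PDs as a starting point for trending.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def flag_rate(model_flags: Sequence[bool], expert_flags: Sequence[bool]) -> float:
    """Share of expert-positive PDs that the model also flagged (recall)."""
    if len(model_flags) != len(expert_flags):
        raise ValueError("flag sequences must have the same length")
    relevant = [m for m, e in zip(model_flags, expert_flags) if e]
    if not relevant:
        raise ValueError("no expert-positive PDs to evaluate against")
    return sum(relevant) / len(relevant)


def flags_by_program(records: Iterable[Tuple[str, bool]]) -> Counter:
    """Count flagged PDs per clinical program from (program, flagged) pairs."""
    return Counter(program for program, flagged in records if flagged)


if __name__ == "__main__":
    # Made-up illustration data, not results from the study.
    model = [True, True, False, True, True, False]
    expert = [True, True, True, True, True, False]
    print(f"flag rate: {flag_rate(model, expert):.0%}")
    print(flags_by_program([("A", True), ("A", True), ("B", False), ("B", True)]))
```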

Source journal

Therapeutic innovation & regulatory science (MEDICAL INFORMATICS; PHARMACOLOGY & PHARMACY)

CiteScore: 3.40

Self-citation rate: 13.30%

Articles published: 127

Journal description: Therapeutic Innovation & Regulatory Science (TIRS) is the official scientific journal of DIA that strives to advance medical product discovery, development, regulation, and use through the publication of peer-reviewed original and review articles, commentaries, and letters to the editor across the spectrum of converting biomedical science into practical solutions to advance human health. The focus areas of the journal are as follows: Biostatistics; Clinical Trials; Product Development and Innovation; Global Perspectives; Policy; Regulatory Science; Product Safety; Special Populations.