Machine learning approach for detection of MACE events within clinical trial data.

IF 1.2 · Zone 4, Medicine · JCR Q4 Pharmacology & Pharmacy

Journal of Biopharmaceutical Statistics Pub Date : 2024-11-16 DOI:10.1080/10543406.2024.2420640

John A Spanias, Robbie Buderi, Pierre-Louis Bourlon, Christopher Tso, Caleb Strait, David Saunders, Kayleen Ports, Weixi Chen, Rahul Jain, Bhargav Koduru, Danielle Gerome, Eric Yang, Silvy Saltzmann, Aniketh Talwai, Tanmay Jain, Jacob Aptekar

Pages: 1-16. Publication type: Journal Article. Open access: no.

Citations: 0

Abstract

Randomized controlled trials (RCTs) are the gold standard for clinical research but may not accurately reflect the impact of medicines in real-world settings. Supplementing RCTs with insights from real-world data (RWD) can address known limitations by including more diverse patient populations, additional types of sites-of-care, and practices more representative of the care most people receive. One current challenge in using RWD is the lack of an algorithmic approach to identifying outcomes. To address this, machine learning models for identifying a frequently used outcome, Major Adverse Cardiovascular Events (MACE), were developed in Clinical Trial Data (CTD). Anonymized CTD sourced from the Medidata Enterprise Data Store were used to develop model features on the condition that they would be useful for labelling MACE events and that they could also be found in RWD. These features were used to train three random forest models to identify each component of 3-point MACE in a patient's clinical trial journey. Performance metrics for the models are presented (recall = 0.72 [0.07], precision = 0.68 [0.12] - mean, [SD]) along with the top contributing features. We show that the models can be tuned specifically to replicate the adjudication panels' results and present a cost-benefit analysis for deploying such models in clinical trial settings. We demonstrate the viability of using advanced algorithms for identifying clinical outcomes in prospective clinical trials. Deployment of such models could reduce the resources required to conduct RCTs. Extending such models to RWD would facilitate approval of pragmatic clinical trials for regulatory submissions.
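As a rough illustration of how summary figures of the form "recall = 0.72 [0.07]" can be produced, the sketch below computes precision and recall for three per-component classifiers and then reports the mean and standard deviation across them. It rests on several assumptions: the component names follow the standard 3-point MACE definition (cardiovascular death, myocardial infarction, stroke), the confusion counts are invented for illustration and are not taken from the paper, and the abstract does not say whether the reported mean and SD are taken across the three models or across some other grouping such as cross-validation folds.

```python
from statistics import mean, stdev


def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts; a zero denominator yields 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical confusion counts for each 3-point MACE component.
# These numbers are illustrative only and do NOT come from the paper.
counts = {
    "cardiovascular_death": {"tp": 30, "fp": 10, "fn": 10},
    "myocardial_infarction": {"tp": 40, "fp": 20, "fn": 20},
    "stroke": {"tp": 24, "fp": 16, "fn": 6},
}

# One (precision, recall) pair per component model.
per_model = {name: precision_recall(**c) for name, c in counts.items()}
precisions = [p for p, _ in per_model.values()]
recalls = [r for _, r in per_model.values()]

# Mean and sample standard deviation (n - 1 denominator) across the models.
summary = {
    "precision": (mean(precisions), stdev(precisions)),
    "recall": (mean(recalls), stdev(recalls)),
}

for metric, (m, sd) in summary.items():
    print(f"{metric} = {m:.2f} [{sd:.2f}]")
```

Raising a model's decision threshold generally trades recall for precision, which is one way such models might be tuned toward an adjudication panel's labels; the abstract does not describe the tuning procedure the authors actually used.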


Source journal

Journal of Biopharmaceutical Statistics (Medicine: Statistics and Probability)

CiteScore: 2.50

Self-citation rate: 18.20%

Review time: 6-12 weeks

About the journal: The Journal of Biopharmaceutical Statistics, a rapid publication journal, discusses quality applications of statistics in biopharmaceutical research and development. Now publishing six times per year, it includes expositions of statistical methodology with immediate applicability to biopharmaceutical research in the form of full-length and short manuscripts, review articles, selected/invited conference papers, short articles, and letters to the editor. Addressing timely and provocative topics important to the biostatistical profession, the journal covers: Drug, device, and biological research and development; Drug screening and drug design; Assessment of pharmacological activity; Pharmaceutical formulation and scale-up; Preclinical safety assessment; Bioavailability, bioequivalence, and pharmacokinetics; Phase I, II, and III clinical development including complex innovative designs; Premarket approval assessment of clinical safety; Postmarketing surveillance; Big data and artificial intelligence and applications.