Federico Amato, Rainer Strotmann, Roberto Castello, Rolf Bruns, Vishal Ghori, Andreas Johne, Karin Berghoff, Karthik Venkatakrishnan, Nadia Terranova
Clinical and Translational Science, volume 17, issue 9. Published 2024-09-02. DOI: 10.1111/cts.70010. https://onlinelibrary.wiley.com/doi/10.1111/cts.70010
Citations: 0
Explainable machine learning prediction of edema adverse events in patients treated with tepotinib
Abstract
Tepotinib is approved for the treatment of patients with non-small-cell lung cancer harboring MET exon 14 skipping alterations. Although edema is the most prevalent adverse event (AE) and a known class effect of MET inhibitors, including tepotinib, the factors contributing to its occurrence remain poorly understood. Herein, we apply machine learning (ML)-based approaches to predict the likelihood of edema in patients undergoing tepotinib treatment and to identify factors influencing its development over time. Data from 612 patients receiving tepotinib in five Phase I/II studies were modeled with two ML algorithms, Random Forest and Gradient Boosting Trees, to predict edema AE incidence and severity. Probability calibration was applied to give a realistic estimate of the likelihood of an edema AE. The best model was tested on follow-up data and on data from clinical studies not used during training. Results showed high performance across all tested settings, with F1 scores up to 0.961 when the model was retrained with the most relevant covariates. ML explainability methods identified serum albumin as the most informative longitudinal covariate and associated higher age with higher probabilities of more severe edema. The developed methodological framework enables the use of ML algorithms for analyzing clinical safety data and exploiting longitudinal information through various covariate engineering approaches. Probability calibration ensures accurate estimation of the likelihood of AE occurrence, while explainability tools can identify factors contributing to model predictions, thereby supporting interpretation at both the population and the individual patient level.
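As a rough illustration of two quantities the abstract relies on, the sketch below computes an F1 score and two simple calibration checks (a Brier score and a binned reliability table) in plain Python. It is not the authors' pipeline: the labels and probabilities are made-up toy values, the 0.5 decision threshold and the five bins are arbitrary assumptions, and the function names are hypothetical.

```python
from collections import defaultdict


def f1_score(y_true, y_pred):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def reliability_table(y_true, y_prob, n_bins=5):
    """Per probability bin: (bin index, count, mean predicted prob, observed event rate).

    For a well-calibrated model the last two columns are close in every bin.
    """
    bins = defaultdict(list)
    for t, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, t))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_prob = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(t for _, t in pairs) / len(pairs)
        rows.append((idx, len(pairs), mean_prob, observed))
    return rows


# Toy values (assumed, NOT study data): 1 = edema AE observed, 0 = not observed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

print("F1:", f1_score(y_true, y_pred))
print("Brier:", brier_score(y_true, y_prob))
for row in reliability_table(y_true, y_prob):
    print(row)
```

In practice, a library implementation such as scikit-learn's metrics and calibration utilities would likely be used instead; the hand-written versions here only make the definitions explicit.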
About the journal:
Clinical and Translational Science (CTS), an official journal of the American Society for Clinical Pharmacology and Therapeutics, highlights original translational medicine research that helps bridge laboratory discoveries with the diagnosis and treatment of human disease. Translational medicine is a multi-faceted discipline with a focus on translational therapeutics. In a broad sense, translational medicine bridges the discovery, development, regulation, and utilization spectrum. Research may appear as Full Articles, Brief Reports, Commentaries, Phase Forwards (clinical trials), Reviews, or Tutorials. CTS also includes invited didactic content that covers the connections between clinical pharmacology and translational medicine. Best-in-class methodologies and best practices are also welcomed as Tutorials. These additional features provide context for research articles and facilitate understanding for a wide array of individuals interested in clinical and translational science. CTS welcomes high-quality, scientifically sound, original manuscripts focused on clinical pharmacology and translational science, including animal, in vitro, in silico, and clinical studies supporting the breadth of drug discovery, development, regulation, and clinical use of both traditional drugs and innovative modalities.