Federico Amato, Rainer Strotmann, Roberto Castello, Rolf Bruns, Vishal Ghori, Andreas Johne, Karin Berghoff, Karthik Venkatakrishnan, Nadia Terranova
{"title":"Explainable machine learning prediction of edema adverse events in patients treated with tepotinib","authors":"Federico Amato, Rainer Strotmann, Roberto Castello, Rolf Bruns, Vishal Ghori, Andreas Johne, Karin Berghoff, Karthik Venkatakrishnan, Nadia Terranova","doi":"10.1111/cts.70010","DOIUrl":null,"url":null,"abstract":"<p>Tepotinib is approved for the treatment of patients with non-small-cell lung cancer harboring <i>MET</i> exon 14 skipping alterations. While edema is the most prevalent adverse event (AE) and a known class effect of MET inhibitors including tepotinib, there is still limited understanding about the factors contributing to its occurrence. Herein, we apply machine learning (ML)-based approaches to predict the likelihood of occurrence of edema in patients undergoing tepotinib treatment, and to identify factors influencing its development over time. Data from 612 patients receiving tepotinib in five Phase I/II studies were modeled with two ML algorithms, Random Forest, and Gradient Boosting Trees, to predict edema AE incidence and severity. Probability calibration was applied to give a realistic estimation of the likelihood of edema AE. Best model was tested on follow-up data and on data from clinical studies unused while training. Results showed high performances across all the tested settings, with F1 scores up to 0.961 when retraining the model with the most relevant covariates. The use of ML explainability methods identified serum albumin as the most informative longitudinal covariate, and higher age as associated with higher probabilities of more severe edema. The developed methodological framework enables the use of ML algorithms for analyzing clinical safety data and exploiting longitudinal information through various covariate engineering approaches. Probability calibration ensures the accurate estimation of the likelihood of the AE occurrence, while explainability tools can identify factors contributing to model predictions, hence supporting population and individual patient-level interpretation.</p>","PeriodicalId":50610,"journal":{"name":"Cts-Clinical and Translational Science","volume":"17 9","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cts.70010","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cts-Clinical and Translational Science","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cts.70010","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICINE, RESEARCH & EXPERIMENTAL","Score":null,"Total":0}
引用次数: 0
Abstract
Tepotinib is approved for the treatment of patients with non-small-cell lung cancer harboring MET exon 14 skipping alterations. While edema is the most prevalent adverse event (AE) and a known class effect of MET inhibitors including tepotinib, there is still limited understanding about the factors contributing to its occurrence. Herein, we apply machine learning (ML)-based approaches to predict the likelihood of occurrence of edema in patients undergoing tepotinib treatment, and to identify factors influencing its development over time. Data from 612 patients receiving tepotinib in five Phase I/II studies were modeled with two ML algorithms, Random Forest, and Gradient Boosting Trees, to predict edema AE incidence and severity. Probability calibration was applied to give a realistic estimation of the likelihood of edema AE. Best model was tested on follow-up data and on data from clinical studies unused while training. Results showed high performances across all the tested settings, with F1 scores up to 0.961 when retraining the model with the most relevant covariates. The use of ML explainability methods identified serum albumin as the most informative longitudinal covariate, and higher age as associated with higher probabilities of more severe edema. The developed methodological framework enables the use of ML algorithms for analyzing clinical safety data and exploiting longitudinal information through various covariate engineering approaches. Probability calibration ensures the accurate estimation of the likelihood of the AE occurrence, while explainability tools can identify factors contributing to model predictions, hence supporting population and individual patient-level interpretation.
期刊介绍:
Clinical and Translational Science (CTS), an official journal of the American Society for Clinical Pharmacology and Therapeutics, highlights original translational medicine research that helps bridge laboratory discoveries with the diagnosis and treatment of human disease. Translational medicine is a multi-faceted discipline with a focus on translational therapeutics. In a broad sense, translational medicine bridges across the discovery, development, regulation, and utilization spectrum. Research may appear as Full Articles, Brief Reports, Commentaries, Phase Forwards (clinical trials), Reviews, or Tutorials. CTS also includes invited didactic content that covers the connections between clinical pharmacology and translational medicine. Best-in-class methodologies and best practices are also welcomed as Tutorials. These additional features provide context for research articles and facilitate understanding for a wide array of individuals interested in clinical and translational science. CTS welcomes high quality, scientifically sound, original manuscripts focused on clinical pharmacology and translational science, including animal, in vitro, in silico, and clinical studies supporting the breadth of drug discovery, development, regulation and clinical use of both traditional drugs and innovative modalities.