ML Framework for Aggregating Individual-Level and Averaged Clinical Data
Authors: A Vaquero Castro, M Simeoni, E Grisan
Journal: Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference, vol. 2025, pp. 1-4
Publication date: 2025-07-01
DOI: https://doi.org/10.1109/EMBC58623.2025.11254899
Citation count: 0
Abstract
Pharmacokinetic-pharmacodynamic (PK/PD) data analysis is a cornerstone of drug development and of efficacy and safety studies. However, individual-level PK/PD data are difficult and expensive to obtain and are scattered across clinical trials, for which usually only aggregated statistics are publicly reported. Meta-analysis (MA) approaches, ranging from simple MA to the more advanced multivariate meta-regression and model-based MA (MBMA), are among the available tools for interpreting average-level data. Ideally, access to individual patient data (IPD) would allow methods based on parametric pharmacological models, such as MBMA, to better characterize the relationships between covariates and PK/PD parameters. We propose a generative-AI approach that regenerates the IPD of cohorts for which only population-level statistics are available, by exploiting a small set of available IPD. To test the methodology, we simulate a scenario with several datasets, each corresponding to a different clinical study. The generative model is trained on IPD from a single study and can then generate IPD from the population statistics of all the others. We show that our algorithm can learn the relationships present in the IPD study and apply them to regenerate information lost when data are averaged for external reporting. To validate the approach, we carried out performance tests, which showed good agreement between model-simulated ground-truth data and ML-generated data.
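The abstract does not specify the generative model, so the following is only a minimal sketch of the general idea (learn a covariate-parameter relationship from one IPD study, then regenerate individuals for a study that reports only means and standard deviations). It substitutes a simple bivariate Gaussian for the paper's ML model. All names, the log body weight / log clearance variables, and the numeric values are hypothetical illustrations, not taken from the paper.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def regenerate_ipd(mean_x, sd_x, mean_y, sd_y, rho, n, rng):
    """Draw n synthetic (x, y) pairs from a bivariate normal that uses the
    reported means and SDs together with the correlation rho learned from IPD."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mean_x + sd_x * z1
        # Mix z1 and z2 so that corr(x, y) equals rho by construction.
        y = mean_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        pairs.append((x, y))
    return pairs


rng = random.Random(0)

# Hypothetical IPD study: log clearance depends linearly on log body weight.
ipd_log_wt = [rng.gauss(math.log(70.0), 0.2) for _ in range(500)]
ipd_log_cl = [
    math.log(5.0) + 0.75 * (w - math.log(70.0)) + rng.gauss(0.0, 0.1)
    for w in ipd_log_wt
]

# "Training": learn the dependence structure from the single IPD study.
rho = pearson(ipd_log_wt, ipd_log_cl)

# Hypothetical external study that reports only means and SDs.
synthetic = regenerate_ipd(math.log(80.0), 0.25, math.log(5.5), 0.22, rho, 2000, rng)
syn_wt = [p[0] for p in synthetic]
syn_cl = [p[1] for p in synthetic]
```

The synthetic cohort matches the reported marginal statistics of the external study while inheriting the weight-clearance correlation from the IPD study, which is the information that averaging discards. A real generative model would presumably capture richer, nonlinear structure than a single correlation coefficient.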