Mimicking Clinical Trials Using Real-World Data: A Novel Method and Applications
Wei-Jhih Wang, Aasthaa Bansal, Caroline Savage Bennette, Anirban Basu
Medical Decision Making, 43(3):275-287. Published 2023-04-01. DOI: 10.1177/0272989X221141381 (https://doi.org/10.1177/0272989X221141381)
Abstract
Introduction: Simulating individual-level trial data when only summary data are available is often useful for meta-analysis, for forming external control arms, and for calibrating trial results to real-world data (RWD). The joint distribution of baseline characteristics in a trial is usually simulated by combining the trial's summary data with correlations estimated from RWD. However, RWD correlations may not be a perfect proxy for those in the trial. A misspecified correlation structure could bias any analysis in which the outcome-generating models are nonlinear or include effect modifiers.
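The conventional approach described above, pairing a trial's published marginals with a correlation borrowed from RWD, can be illustrated with a two-variable Gaussian copula. This is a minimal sketch and not the authors' algorithm: the function name `simulate_cohort`, the choice of covariates (age and a binary stage indicator), and all numeric values are hypothetical, and `rho` is the correlation on the latent normal scale, which is assumed here to be taken directly from RWD.

```python
import math
import random
from statistics import NormalDist


def simulate_cohort(n, age_mean, age_sd, p_stage4, rho, seed=0):
    """Draw n (age, stage4) pairs from a bivariate Gaussian copula.

    Hypothetical inputs: age_mean, age_sd and p_stage4 stand in for a
    trial's published summary data; rho is a latent-scale correlation
    assumed to come from RWD. Requires 0 < p_stage4 < 1 and |rho| < 1.
    """
    rng = random.Random(seed)
    # Latent-scale threshold chosen so that P(z2 > cut) = p_stage4.
    cut = NormalDist().inv_cdf(1.0 - p_stage4)
    rows = []
    for _ in range(n):
        # Two standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map each latent draw to its target marginal.
        age = age_mean + age_sd * z1      # normal marginal for age
        stage4 = 1 if z2 > cut else 0     # Bernoulli marginal for stage
        rows.append((age, stage4))
    return rows


# Illustrative values only.
cohort = simulate_cohort(20000, age_mean=65.0, age_sd=8.0,
                         p_stage4=0.3, rho=0.5, seed=1)
```

The marginals match the summary data by construction, so any error in the simulated joint distribution comes entirely from `rho`. That is the quantity the abstract argues RWD may misstate.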
Methods: We developed an iterative algorithm using copulas and resampling, based on the estimated propensity score for enrollment in a trial given participants' characteristics. Validation was performed using Monte Carlo simulations under different scenarios in which the marginal and joint distributions of covariates differ between trial samples and RWD. Two applications were illustrated using an actual trial and the Surveillance, Epidemiology, and End Results (SEER)-Medicare data. We calculated the standardized mean difference (SMD) to assess the generalizability of the trial, and we explored the feasibility of generating an external control by applying a parametric Weibull model trained on RWD to predict survival in the simulated trial cohort.
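For readers unfamiliar with the SMD, the following sketch shows one common way to compute it from summary statistics alone. The abstract does not state which formula the authors used; the pooled-standard-deviation form below is an assumption, and the function names and example values are hypothetical.

```python
import math


def smd_continuous(mean_a, sd_a, mean_b, sd_b):
    """SMD for a continuous covariate, using the average of the two variances."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


def smd_binary(p_a, p_b):
    """SMD for a binary covariate, using Bernoulli variances p * (1 - p)."""
    pooled_sd = math.sqrt((p_a * (1.0 - p_a) + p_b * (1.0 - p_b)) / 2.0)
    return (p_a - p_b) / pooled_sd


# Illustrative values only: trial cohort (a) versus RWD cohort (b).
age_smd = smd_continuous(62.0, 8.0, 70.0, 8.0)
stage_smd = smd_binary(0.3, 0.5)
```

An absolute SMD near zero suggests the two cohorts are similar on that covariate; larger values point to a difference that may limit how well trial results carry over to the real-world population.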
Results: Across all scenarios, the approximated correlations derived from the algorithm were closer to the true correlations than the RWD correlations were. The algorithm also successfully reproduced the joint distribution of characteristics for the actual trial. A similar SMD was observed using simulated data and individual-level trial data. The 95% confidence intervals of the adjusted survival estimates from the simulated trial overlapped with those of the actual trial's Kaplan-Meier estimates.
Conclusions: The algorithm could be a feasible way to simulate individual-level data when only summary data are available. Further research is needed to validate our approach with larger sample sizes.
Highlights: The correlation structure is crucial to building the joint distribution of patient characteristics, and a misspecified correlation structure could potentially influence predicted outcomes. An iterative algorithm was developed to approximate a trial's correlation structure using published summary trial data and real-world data. The algorithm could be a feasible way to simulate individual-level trial data when only trial summary data are available.
About the journal:
Medical Decision Making offers rigorous and systematic approaches to decision making that are designed to improve the health and clinical care of individuals and to assist with health care policy development. Using the fundamentals of decision analysis and theory, economic evaluation, and evidence-based quality assessment, Medical Decision Making presents both theoretical and practical statistical and modeling techniques and methods from a variety of disciplines.