An evolutionary algorithm for the direct optimization of covariate balance between nonrandomized populations
Stephen Privitera, Hooman Sedghamiz, Alexander Hartenstein, Tatsiana Vaitsiakhovich, Frank Kleinjung
Pharmaceutical Statistics, pp. 288-307. Published 2024-05-01 (Epub 2023-12-18).
DOI: 10.1002/pst.2352 (https://doi.org/10.1002/pst.2352)
Abstract
Matching reduces confounding bias in comparing the outcomes of nonrandomized patient populations by removing systematic differences between them. Under very basic assumptions, propensity score (PS) matching can be shown to eliminate bias entirely in estimating the average treatment effect on the treated. In practice, misspecification of the PS model leads to deviations from theory and matching quality is ultimately judged by the observed post-matching balance in baseline covariates. Since covariate balance is the ultimate arbiter of successful matching, we argue for an approach to matching in which the success criterion is explicitly specified and describe an evolutionary algorithm to directly optimize an arbitrary metric of covariate balance. We demonstrate the performance of the proposed method using a simulated dataset of 275,000 patients and 10 matching covariates. We further apply the method to match 250 patients from a recently completed clinical trial to a pool of more than 160,000 patients identified from electronic health records on 101 covariates. In all cases, we find that the proposed method outperforms PS matching as measured by the specified balance criterion. We additionally find that the evolutionary approach can perform comparably to another popular direct optimization technique based on linear integer programming, while having the additional advantage of supporting arbitrary balance metrics. We demonstrate how the chosen balance metric impacts the statistical properties of the resulting matched populations, emphasizing the potential impact of using nonlinear balance functions in constructing an external control arm. We release our implementation of the considered algorithms in Python.
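To make the idea of directly optimizing a balance metric concrete, the following is a minimal sketch and not the authors' released implementation. It assumes a candidate solution is a set of control indices of the same size as the treated group, that mutation swaps one selected control for an unselected one, and that the best individuals among parents and children survive each generation. The function names (`max_abs_smd`, `mutate`, `evolve_match`), the choice of the maximum absolute standardized mean difference as the balance metric, and all parameter values are illustrative assumptions. Any callable with the same signature could be passed as `balance`, which is the sense in which the metric is arbitrary.

```python
import numpy as np


def max_abs_smd(treated, controls):
    """Assumed balance metric: largest absolute standardized mean difference."""
    pooled_sd = np.sqrt(
        (treated.var(axis=0, ddof=1) + controls.var(axis=0, ddof=1)) / 2.0
    )
    pooled_sd = np.where(pooled_sd > 0, pooled_sd, 1.0)  # guard constant columns
    diff = treated.mean(axis=0) - controls.mean(axis=0)
    return float(np.max(np.abs(diff) / pooled_sd))


def mutate(subset, pool_size, rng, n_swaps=1):
    """Replace n_swaps selected controls with currently unselected ones."""
    child = subset.copy()
    unselected = np.setdiff1d(np.arange(pool_size), child)
    out_pos = rng.choice(child.size, size=n_swaps, replace=False)
    child[out_pos] = rng.choice(unselected, size=n_swaps, replace=False)
    return child


def evolve_match(treated, pool, balance=max_abs_smd,
                 pop_size=20, n_generations=200, seed=0):
    """Search for a control subset that minimizes balance(treated, subset)."""
    rng = np.random.default_rng(seed)
    n = treated.shape[0]
    # Initial population: random control subsets of the same size as the treated group.
    population = [rng.choice(pool.shape[0], size=n, replace=False)
                  for _ in range(pop_size)]
    scores = [balance(treated, pool[ind]) for ind in population]
    for _ in range(n_generations):
        children = [mutate(ind, pool.shape[0], rng) for ind in population]
        child_scores = [balance(treated, pool[c]) for c in children]
        # Elitist selection: keep the best pop_size among parents and children.
        combined = population + children
        combined_scores = scores + child_scores
        order = np.argsort(combined_scores)[:pop_size]
        population = [combined[i] for i in order]
        scores = [combined_scores[i] for i in order]
    return population[0], scores[0]


# Synthetic illustration only: treated covariates are shifted relative to the pool.
data_rng = np.random.default_rng(42)
treated = data_rng.normal(0.5, 1.0, size=(30, 3))
pool = data_rng.normal(0.0, 1.0, size=(500, 3))

# Baseline: the first random subset that evolve_match(seed=0) also starts from.
random_subset = np.random.default_rng(0).choice(500, size=30, replace=False)
baseline = max_abs_smd(treated, pool[random_subset])

matched_idx, best = evolve_match(treated, pool)
print(f"random subset: {baseline:.3f}, evolved subset: {best:.3f}")
```

Because selection keeps the best individuals from both parents and children, the best score cannot get worse from one generation to the next. The sketch omits crossover, stopping rules, and anything tuned to the dataset sizes mentioned in the abstract.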
About the journal
Pharmaceutical Statistics is an industry-led initiative, tackling real problems in statistical applications. The Journal publishes papers that share experiences in the practical application of statistics within the pharmaceutical industry. It covers all aspects of pharmaceutical statistical applications from discovery, through pre-clinical development, clinical development, post-marketing surveillance, consumer health, production, epidemiology, and health economics.
The Journal is both international and multidisciplinary. It includes high quality practical papers, case studies and review papers.