High-dimensional Iterative Causal Forest (hdiCF) for Subgroup Identification Using Health Care Claims Data
Tiansheng Wang, Virginia Pate, Richard Wyss, John B Buse, Michael R Kosorok, Til Stürmer
American Journal of Epidemiology. Publication date: 2025-06-11. Publication type: Journal Article.
DOI: 10.1093/aje/kwaf127 (https://doi.org/10.1093/aje/kwaf127)
Impact factor: 5.0. JCR: Q1 (Public, Environmental & Occupational Health).
Citations: 0
Abstract
We tested a novel high-dimensional approach (1 ordinal variable per code with up to 4 levels: never, once, sporadic, or frequent) against the standard high-dimensional propensity score (hdPS) approach (up to 3 binary variables per code) for detecting heterogeneous treatment effects (HTE). Using the iterative causal forest (iCF) subgrouping algorithm, we analyzed a new-user cohort of 8,075 initiators of sodium-glucose cotransporter-2 inhibitors and 7,313 initiators of glucagon-like peptide-1 receptor agonists from a 20% random Medicare sample (2015-2019) with ≥1 year of Parts A/B/D enrollment and without severe renal disease. We extracted the 200 most prevalent codes across diagnoses, procedures, and prescriptions during the 1-year baseline period. Subgroup-specific conditional average treatment effects (CATEs) were estimated as 2-year adjusted risk differences (aRD) in hospitalization for heart failure using inverse probability of treatment weighting. The overall population had an aRD of -0.4% (95% CI: -1.1%, 0.2%). Our high-dimensional setting identified patients with ≥2 loop diuretic prescriptions (aRD: -2.6%; 95% CI: -5.0%, -0.2%) as the subgroup with the largest CATE. In contrast, the hdPS high-dimensional setting identified patients with chronic kidney disease (aRD: -1.7%; 95% CI: -3.6%, 0.2%). Across various sensitivity analyses, our high-dimensional approach more accurately identified the expected subgroups with HTE, in line with prior clinical evidence.
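To make the two covariate encodings and the weighting step more concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not give the cutoffs for "sporadic" and "frequent", so the sketch assumes the usual hdPS convention (at or above the median, and at or above the 75th percentile, of counts among patients who have the code). The function names `hdps_indicators`, `encode_ordinal`, `nearest_rank`, and `iptw_risk_difference` are hypothetical, and the propensity scores are taken as given rather than estimated.

```python
import math


def nearest_rank(sorted_vals, frac):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    k = max(1, math.ceil(frac * len(sorted_vals)))
    return sorted_vals[k - 1]


def hdps_indicators(count, med, q75):
    """hdPS-style encoding: up to 3 binary variables for one code.

    ASSUMPTION: cutoffs are >=1, >=median, >=75th percentile of the
    counts among patients with at least one occurrence of the code.
    """
    return (int(count >= 1), int(count >= med), int(count >= q75))


def encode_ordinal(counts):
    """Ordinal encoding: one variable per code with up to 4 levels.

    0 = never, 1 = once, 2 = sporadic, 3 = frequent. Under the assumed
    cutoffs the level is simply the sum of the 3 binary indicators.
    When the median is 1, a single occurrence is coded as level 2.
    """
    positive = sorted(c for c in counts if c > 0)
    if not positive:
        return [0] * len(counts)
    med = nearest_rank(positive, 0.50)
    q75 = nearest_rank(positive, 0.75)
    return [sum(hdps_indicators(c, med, q75)) for c in counts]


def iptw_risk_difference(treated, outcome, ps):
    """Weighted risk difference (treated minus comparator).

    Treated patients get weight 1/ps and comparators 1/(1 - ps); each
    arm's risk is a weighted mean of the binary outcome. Propensity
    scores must lie strictly between 0 and 1.
    """
    num1 = den1 = num0 = den0 = 0.0
    for t, y, p in zip(treated, outcome, ps):
        if t:
            w = 1.0 / p
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - p)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    # Toy baseline counts of one code (e.g. loop diuretic fills) for 6 patients.
    print(encode_ordinal([0, 1, 1, 2, 4, 8]))
    # Toy 4-patient cohort with a constant propensity score of 0.5.
    print(iptw_risk_difference([1, 1, 0, 0], [1, 0, 1, 1], [0.5] * 4))
```

In this toy example the ordinal variable carries the same information as the three binary indicators but in a single column, which is the contrast the abstract draws between the two settings. In practice the propensity scores would come from a fitted model, and the risk difference would be computed within each subgroup returned by the subgrouping algorithm.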
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.