Targeted learning with an undersmoothed LASSO propensity score model for large-scale covariate adjustment in health-care database studies.

IF 5.0 | Zone 2, Medicine | Q1, Public, Environmental & Occupational Health

American Journal of Epidemiology. Pub Date: 2024-11-04. DOI: 10.1093/aje/kwae023

Richard Wyss, Mark van der Laan, Susan Gruber, Xu Shi, Hana Lee, Sarah K Dutcher, Jennifer C Nelson, Sengwee Toh, Massimiliano Russo, Shirley V Wang, Rishi J Desai, Kueiyu Joshua Lin



Abstract

Least absolute shrinkage and selection operator (LASSO) regression is widely used for large-scale propensity score (PS) estimation in health-care database studies. In these settings, previous work has shown that undersmoothing (overfitting) LASSO PS models can improve confounding control, but it can also cause problems of nonoverlap in covariate distributions. It remains unclear how to select the degree of undersmoothing when fitting large-scale LASSO PS models to improve confounding control while avoiding issues that can result from reduced covariate overlap. Here, we used simulations to evaluate the performance of using collaborative-controlled targeted learning to data-adaptively select the degree of undersmoothing when fitting large-scale PS models within both singly and doubly robust frameworks to reduce bias in causal estimators. Simulations showed that collaborative learning can data-adaptively select the degree of undersmoothing to reduce bias in estimated treatment effects. Results further showed that when fitting undersmoothed LASSO PS models, the use of cross-fitting was important for avoiding nonoverlap in covariate distributions and reducing bias in causal estimates.
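The idea of varying the degree of undersmoothing and cross-fitting the PS can be illustrated with a minimal sketch. This is not the authors' collaborative-controlled targeted learning procedure: it only fits an L1-penalized logistic PS model at a few penalty values (smaller penalty means more undersmoothing), obtains out-of-fold PS predictions, and reports an inverse-probability-weighted estimate alongside the PS range as a crude overlap check. The simulated data, the penalty grid, the proximal-gradient solver, and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def predict_ps(model, row):
    b0, beta = model
    return sigmoid(b0 + sum(b * x for b, x in zip(beta, row)))


def fit_lasso_logistic(X, a, lam, n_iter=200, step=1.0):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is not penalized. Smaller `lam` = more undersmoothing.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [predict_ps((b0, beta), row) - ai for row, ai in zip(X, a)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * sum(resid) / n
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return b0, beta


def cross_fit_ps(X, a, lam, k=4, seed=0):
    """Out-of-fold PS: each subject's PS comes from a model fit without them."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    ps = [0.0] * len(X)
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit_lasso_logistic([X[i] for i in train], [a[i] for i in train], lam)
        for i in test:
            ps[i] = predict_ps(model, X[i])
    return ps


def iptw_ate(a, y, ps):
    """Normalized (Hajek) inverse-probability-weighted mean difference."""
    w1 = [ai / q for ai, q in zip(a, ps)]
    w0 = [(1 - ai) / (1 - q) for ai, q in zip(a, ps)]
    mu1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mu0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mu1 - mu0


def simulate(n=300, p=8, seed=1):
    """Hypothetical data: 3 confounders, p - 3 noise covariates, true effect 1.0."""
    rng = random.Random(seed)
    X, a, y = [], [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(p)]
        conf = row[0] + row[1] + row[2]
        ai = 1 if rng.random() < sigmoid(0.8 * conf) else 0
        X.append(row)
        a.append(ai)
        y.append(1.0 * ai + conf + rng.gauss(0, 1))
    return X, a, y


if __name__ == "__main__":
    X, a, y = simulate()
    for lam in (1.0, 0.05, 0.001):  # illustrative grid, heavy to light penalty
        ps = cross_fit_ps(X, a, lam)
        print(f"lam={lam}: PS range=({min(ps):.3f}, {max(ps):.3f}), "
              f"IPTW estimate={iptw_ate(a, y, ps):.3f}")
```

In this sketch the penalty is chosen by hand from a fixed grid. The study instead evaluates selecting the degree of undersmoothing data-adaptively, within both singly and doubly robust estimators, which the code above does not attempt.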


Source journal

American Journal of Epidemiology (Medicine: Public, Environmental & Occupational Health)

CiteScore: 7.40

Self-citation rate: 4.00%

Articles published: 221

Review time: 3-6 weeks

About the journal: The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research. It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.