REFINE2: A simplified simulation tool to help epidemiologists evaluate the suitability and sensitivity of effect estimation within user-specified data.

IF 4.8; Zone 2 (Medicine); JCR Q1: Public, Environmental & Occupational Health
Xiang Meng, Jonathan Y Huang
American Journal of Epidemiology. Journal Article. Published 2025-09-03. DOI: 10.1093/aje/kwaf195 (https://doi.org/10.1093/aje/kwaf195)

Abstract

Epidemiologists have access to various methods to reduce bias and improve statistical efficiency in effect estimation, from standard multivariable regression to state-of-the-art doubly-robust efficient estimators paired with highly flexible, data-adaptive algorithms ("machine learning"). However, due to numerous assumptions and trade-offs, epidemiologists face practical difficulties in recognizing which method, if any, may be suitable for their specific data and hypotheses. Importantly, relative advantages are necessarily context-specific (data structure, algorithms, model misspecification), limiting the utility of universal guidance. Evaluating performance through real-data-based simulations is useful but out of reach for many epidemiologists. We present a user-friendly, offline Shiny app, REFINE2 (Realistic Evaluations of Finite sample INference using Efficient Estimators), that enables analysts to input their own data and quickly compare the performance of different algorithms within their data context in estimating a prespecified average treatment effect (ATE). REFINE2 automates plasmode simulation of a plausible target ATE given observed covariates and then examines bias and confidence interval coverage (relative to this target) given user-specified models. We present an extensive case study to illustrate how REFINE2 can be used to guide analyses within an epidemiologist's own data under three typical scenarios: residual confounding; spurious covariates; and mis-specified effect modification. As expected, the apparent best method differed across scenarios, and the methods were suboptimal under residual confounding. REFINE2 may help epidemiologists not only choose among imperfect models, but also better understand common underappreciated problems, such as finite sample bias when using machine learning.
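
The plasmode idea described in the abstract can be illustrated with a minimal sketch. It is not REFINE2's implementation (REFINE2 is a Shiny app and its internals are not described here); the toy "observed" data, the outcome model coefficients, the target ATE of 1.0, and the two simple estimators are all assumptions chosen for illustration. The sketch resamples the observed covariate and treatment rows, simulates outcomes with a known effect, and then reports bias and 95% confidence interval coverage relative to that target.

```python
import random
import statistics

TRUE_ATE = 1.0  # hypothetical target effect built into the simulated outcomes

# Hypothetical "observed" data: (confounder x, treatment a) pairs.
# Treatment is more common when x == 1, so x confounds the effect.
observed = [(1, 1)] * 70 + [(1, 0)] * 30 + [(0, 1)] * 30 + [(0, 0)] * 70


def simulate_outcomes(rows, rng):
    """Attach outcomes from an assumed model: y = TRUE_ATE*a + 2*x + noise."""
    return [(x, a, TRUE_ATE * a + 2.0 * x + rng.gauss(0.0, 1.0)) for x, a in rows]


def mean_diff(rows):
    """Unadjusted difference in mean outcome (treated minus untreated) and its SE."""
    y1 = [y for _, a, y in rows if a == 1]
    y0 = [y for _, a, y in rows if a == 0]
    est = statistics.fmean(y1) - statistics.fmean(y0)
    se = (statistics.variance(y1) / len(y1) + statistics.variance(y0) / len(y0)) ** 0.5
    return est, se


def stratified(rows):
    """Average the within-stratum mean differences, weighted by stratum size."""
    n = len(rows)
    est, var = 0.0, 0.0
    for level in sorted({x for x, _, _ in rows}):
        stratum = [r for r in rows if r[0] == level]
        w = len(stratum) / n
        d, se = mean_diff(stratum)
        est += w * d
        var += (w * se) ** 2
    return est, var ** 0.5


def evaluate(estimator, n_sims=500, seed=1):
    """Return (mean bias, 95% CI coverage) of an estimator across simulations."""
    rng = random.Random(seed)
    errors, covered = [], 0
    for _ in range(n_sims):
        # Resample real rows so the covariate-treatment structure is preserved.
        sample = rng.choices(observed, k=len(observed))
        est, se = estimator(simulate_outcomes(sample, rng))
        errors.append(est - TRUE_ATE)
        covered += est - 1.96 * se <= TRUE_ATE <= est + 1.96 * se
    return statistics.fmean(errors), covered / n_sims


results = {
    name: evaluate(f) for name, f in [("unadjusted", mean_diff), ("stratified", stratified)]
}
for name, (bias, coverage) in results.items():
    print(f"{name}: bias={bias:.3f}, coverage={coverage:.3f}")
```

In this toy setup the unadjusted estimator mimics residual confounding: it should show clear positive bias and poor coverage, while the stratified estimator should sit close to the target. A real plasmode analysis would typically fit the outcome model to the analyst's own data rather than assume its coefficients.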

Source journal: American Journal of Epidemiology
Category: Medicine (Public, Environmental & Occupational Health)
CiteScore: 7.40
Self-citation rate: 4.00%
Articles published: 221
Review time: 3-6 weeks
Journal description: The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research. It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.