Maricela Cruz, Susan M Shortreed, Gregory E Simon, Yates Coley
Health Services Research, e70015. Published 2025-07-21. DOI: 10.1111/1475-6773.70015 (https://doi.org/10.1111/1475-6773.70015). Publication type: Journal Article. Impact factor: 3.2; JCR Q2 (Health Care Sciences & Services).
Citations: 0
Abstract
Evaluating Clinical Implementation of Risk Prediction Based Interventions Using Difference-In-Differences.
Objective: To compare alternative Difference-in-Differences (DID) methods for evaluating the effect of risk-stratified interventions, or interventions targeting at-risk groups, on binary outcomes.
Study setting and design: In simulations, we compared operating characteristics of recycled prediction estimators for common average treatment effect on the treated (ATT) estimands across three DID models: the traditional two-group, two-period model; a risk score adjusted model; and a model adjusting for risk score and its interactions with risk group and period. We then compared DID ATT estimates with randomized evaluation estimates for a risk-stratified intervention implemented at Kaiser Permanente Washington (KPWA), which delivered additional text-message reminders to reduce missed clinic visits.
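For the traditional two-group, two-period model, the recycled prediction idea can be sketched without fitting anything: the model is saturated, so fitted risks equal the four observed cell proportions, and the counterfactual risk for the treated group comes from dropping the group-by-period interaction on the link scale. The sketch below is an illustration under that assumption, not the authors' code. The function name `did_att`, the argument names, the use of a lower-risk comparison group, and the four rates are all hypothetical.

```python
import math


def _logit(p):
    return math.log(p / (1.0 - p))


def _expit(x):
    return 1.0 / (1.0 + math.exp(-x))


# Each entry is (link, inverse link).
LINKS = {
    "identity": (lambda p: p, lambda x: x),
    "logit": (_logit, _expit),
    "log": (math.log, math.exp),
}


def did_att(p_hi_pre, p_hi_post, p_lo_pre, p_lo_post, link="identity"):
    """ATT on the absolute difference scale for a saturated 2x2 DID model.

    p_hi_* are outcome risks in the treated (high-risk) group and p_lo_* are
    risks in the comparison group, before and during the intervention period.
    Parallel trends are assumed on the scale of the chosen link.
    """
    g, g_inv = LINKS[link]
    # Counterfactual treated-group risk in the post period: the treated pre-period
    # risk shifted by the comparison group's change, on the link scale.
    counterfactual = g_inv(g(p_hi_pre) + g(p_lo_post) - g(p_lo_pre))
    return p_hi_post - counterfactual


# Hypothetical missed-visit rates, invented for illustration only.
rates = dict(p_hi_pre=0.20, p_hi_post=0.17, p_lo_pre=0.05, p_lo_post=0.04)
for link in LINKS:
    print(link, round(did_att(**rates, link=link), 4))
```

With these made-up rates the identity link gives a negative ATT while the log and logit links give small positive ones, because each link places the parallel trends assumption on a different scale. Adjusting for a risk score would require fitting a regression model and averaging predictions over the treated visits, which this sketch does not attempt.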
Data sources and analytic sample: Our study included 588,503 KPWA visits, with 285,814 (49%) visits pre-evaluation (05/01/2018-10/30/2018) and 302,689 (51%) visits during the evaluation (02/01/2019-09/30/2019). Pre-evaluation, 120,350 visits were classified as high-risk. During the evaluation, 125,076 visits were labeled as high-risk, with 62,557 (50%) randomized to the intervention. We generated data in simulations based on this setting.
Principal findings: In simulations, the traditional DID and risk score adjusted models had smaller bias and standard errors, and better coverage probabilities, than the model that also included risk score interactions. For the ATT on the absolute difference scale (the usual DID ATT estimand), the DID estimates closest to the randomized evaluation estimate (-0.007, 95% CI [-0.010, -0.004]) came from the traditional DID model with the identity link (-0.008, 95% CI [-0.011, -0.005]) and from the risk score adjusted model with any link (identity: -0.006, 95% CI [-0.008, -0.003]; logit: -0.007, 95% CI [-0.011, -0.003]; log: -0.007, 95% CI [-0.012, -0.003]). For all other estimands, the closest estimates came from the risk score adjusted model with log or logit links.
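The abstract does not list the estimands other than the absolute difference. As a hedged illustration, the sketch below assumes they are ratio-type contrasts and shows how one pair of risks (the observed treated-group risk and its model-based counterfactual) could be summarized on three scales. The function name `att_scales` and the input values are hypothetical.

```python
def att_scales(observed, counterfactual):
    """Contrast observed and counterfactual treated-group risks on three scales.

    The risk ratio and odds ratio are assumed examples of non-absolute ATT
    estimands; they are not taken from the abstract.
    """
    def odds(p):
        return p / (1.0 - p)

    return {
        "risk_difference": observed - counterfactual,
        "risk_ratio": observed / counterfactual,
        "odds_ratio": odds(observed) / odds(counterfactual),
    }


# Hypothetical risks, invented for illustration only.
summary = att_scales(observed=0.17, counterfactual=0.19)
for name, value in summary.items():
    print(name, round(value, 4))
```

A value of zero means no effect on the difference scale, while a value of one means no effect on either ratio scale.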
Conclusions: Compared with randomized evaluation results, the traditional DID model is appropriate for the ATT on the absolute difference scale, while the risk score adjusted model with log or logit links is appropriate for all ATT estimands considered.
About the journal:
Health Services Research (HSR) is a peer-reviewed scholarly journal that provides researchers and public and private policymakers with the latest research findings, methods, and concepts related to the financing, organization, delivery, evaluation, and outcomes of health services. Rated as one of the top journals in the fields of health policy and services and health care administration, HSR publishes outstanding articles reporting the findings of original investigations that expand knowledge and understanding of the wide-ranging field of health care and that will help to improve the health of individuals and communities.