Time on Your Side: Aggregating Data in Difference-In-Differences Studies

IF 3.2 | Zone 2 (Medicine) | Q2 HEALTH CARE SCIENCES & SERVICES

Health Services Research, Pub Date: 2025-05-27, DOI: 10.1111/1475-6773.14636

Summer Rak, Laura A. Hatfield, Carrie E. Fry



Abstract

Objective

To compare the performance of difference-in-differences estimators fit to data aggregated to different time scales.

Study Setting and Design

In simulations, we generated monthly observations for 50–100 units over 6 years from both a parametric model and a resampling simulation. The simulation scenarios varied panel balance, treatment timing, and true treatment effects. Our target parameters were static and dynamic average effects of treatment on the treated (ATT) estimated via linear regression (for common timing scenarios) and Callaway and Sant'Anna (2021) estimators (for staggered timing scenarios). We compared estimates from monthly, quarterly, and yearly data using bias, standard error, root mean squared error (RMSE), power, and Type I error. We also conducted a case study to illustrate the real-world impacts of these decisions.
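The aggregation step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' simulation code: the panel size (60 units, 72 months), the treatment start (month 36), the linear time trend, the noise model, and all function names (`simulate_monthly_panel`, `aggregate`, `did_2x2`, `estimates_by_scale`) are hypothetical. It covers only a balanced panel with common treatment timing and a static effect, and it uses a plain difference of group means in place of the regression and Callaway and Sant'Anna estimators.

```python
import random
from collections import defaultdict
from statistics import mean


def simulate_monthly_panel(n_units=60, n_months=72, start=36, effect=1.0, seed=0):
    """Return rows (unit, month, treated, y) for a balanced monthly panel.

    Assumed data-generating process: unit effect + linear trend
    + constant treatment effect after `start` + Gaussian noise.
    """
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        treated = unit < n_units // 2          # first half of units is treated
        unit_effect = rng.gauss(0, 1)
        for month in range(n_months):
            bump = effect if (treated and month >= start) else 0.0
            y = unit_effect + 0.02 * month + bump + rng.gauss(0, 1)
            rows.append((unit, month, treated, y))
    return rows


def aggregate(rows, months_per_period):
    """Average each unit's monthly outcomes within periods of the given length."""
    cells = defaultdict(list)
    for unit, month, treated, y in rows:
        cells[(unit, month // months_per_period, treated)].append(y)
    return [(u, p, tr, mean(ys)) for (u, p, tr), ys in cells.items()]


def did_2x2(rows, first_post_period):
    """(treated post - treated pre) - (control post - control pre)."""
    groups = defaultdict(list)
    for _, period, treated, y in rows:
        groups[(treated, period >= first_post_period)].append(y)
    m = {key: mean(values) for key, values in groups.items()}
    return (m[(True, True)] - m[(True, False)]) - (m[(False, True)] - m[(False, False)])


def estimates_by_scale(start=36):
    """Static effect estimates from monthly, quarterly, and yearly data."""
    panel = simulate_monthly_panel(start=start)
    scales = {"monthly": 1, "quarterly": 3, "yearly": 12}
    # start must be divisible by each period length so that no period
    # mixes pre-treatment and post-treatment months.
    return {name: did_2x2(aggregate(panel, k), start // k) for name, k in scales.items()}


if __name__ == "__main__":
    for name, est in estimates_by_scale().items():
        print(f"{name:9s} {est:.4f}")
```

In this balanced, common-timing setup with treatment aligned to period boundaries, the three point estimates agree up to floating-point error, because averaging months within a period leaves the four group means unchanged. The sketch does not cover standard errors, unbalanced panels, staggered timing, or dynamic effects.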

Data Sources and Analytic Sample

We used data from a study of police retraining for the resampling simulations and case study. These data included counts of use-of-force incidents and dates of training enrollment for 8614 officers each month from 2011 to 2016.

Principal Findings

Results from the simulation varied across performance metrics, estimation methods, target estimands, and data structures. In general, the choice of time aggregation was more consequential when estimating dynamic (versus static) treatment effects, in unbalanced (versus balanced) panel data, and in the resampling simulations (where data had less autocorrelation). Although time aggregation mattered little in many scenarios, coarser aggregation was preferable in resampling simulations of staggered timing scenarios. The re-analysis of police training data was sensitive to time aggregation.

Conclusions

In many scenarios, time aggregation has little impact on difference-in-differences estimators. However, when estimating dynamic effects, especially in staggered timing settings and unbalanced data, we found a tradeoff between precision and power, with finer aggregations being more powerful but less precise. In addition, estimators that use a single reference time point are more sensitive to noise in data measured at finer time scales.
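A stylized variance calculation shows one way a single reference time point can be noisier at finer time scales. This is an illustrative sketch, not a derivation from the paper: it assumes that a unit's twelve monthly outcomes in the reference year share a common variance σ² and a common pairwise correlation ρ between 0 and 1 (both symbols are introduced here as assumptions).

```latex
\operatorname{Var}(y_{t}) = \sigma^{2},
\qquad
\operatorname{Var}\!\left(\frac{1}{12}\sum_{t=1}^{12} y_{t}\right)
  = \frac{1}{144}\left(12\,\sigma^{2} + 12 \cdot 11\,\rho\,\sigma^{2}\right)
  = \frac{\sigma^{2}}{12}\,\bigl(1 + 11\rho\bigr) \le \sigma^{2}.
```

Under these assumptions, a yearly reference value has one twelfth of the variance of a single-month reference value when ρ = 0 and the same variance when ρ = 1, so the gain from averaging shrinks as autocorrelation grows. That pattern is in line with the finding that time aggregation was more consequential in the resampling simulations, where data had less autocorrelation.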


Source journal

Health Services Research (Medicine: Health Care)

CiteScore: 4.80

Self-citation rate: 5.90%

Articles published: 193

Review time: 4-8 weeks

Journal description: Health Services Research (HSR) is a peer-reviewed scholarly journal that provides researchers and public and private policymakers with the latest research findings, methods, and concepts related to the financing, organization, delivery, evaluation, and outcomes of health services. Rated as one of the top journals in the fields of health policy and services and health care administration, HSR publishes outstanding articles reporting the findings of original investigations that expand knowledge and understanding of the wide-ranging field of health care and that will help to improve the health of individuals and communities.