Time on Your Side: Aggregating Data in Difference-In-Differences Studies
Summer Rak, Laura A. Hatfield, Carrie E. Fry
Health Services Research, 60(5). Published 2025-05-27. DOI: 10.1111/1475-6773.14636
https://onlinelibrary.wiley.com/doi/10.1111/1475-6773.14636
Objective
To compare the performance of difference-in-differences estimators fit to data aggregated to different time scales.
Study Setting and Design
In simulations, we generated monthly observations for 50–100 units over 6 years from both a parametric model and a resampling simulation. The simulation scenarios varied panel balance, treatment timing, and true treatment effects. Our target parameters were static and dynamic average effects of treatment on the treated (ATT) estimated via linear regression (for common timing scenarios) and Callaway and Sant'Anna (2021) estimators (for staggered timing scenarios). We compared estimates from monthly, quarterly, and yearly data using bias, standard error, root mean squared error (RMSE), power, and Type I error. We also conducted a case study to illustrate the real-world impacts of these decisions.
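The aggregation step described above can be illustrated with a minimal sketch. This is not the authors' simulation code: the data-generating values, the helper names (`simulate_panel`, `aggregate`, `did_static`), and the choice to average rather than sum months within a period are all assumptions made for illustration. The static ATT is computed as a difference of group mean changes, which coincides with the two-way fixed effects regression coefficient only in the balanced, common-timing case shown here.

```python
import random
from statistics import mean


def simulate_panel(n_units=60, n_months=72, treat_start=36, effect=-0.5, seed=0):
    """Hypothetical balanced monthly panel with common treatment timing."""
    rng = random.Random(seed)
    panel, treated = {}, set()
    for u in range(n_units):
        is_treated = u < n_units // 2
        if is_treated:
            treated.add(u)
        unit_level = rng.gauss(2.0, 0.5)  # unit fixed effect (assumed values)
        series = []
        for t in range(n_months):
            y = unit_level + 0.01 * t + rng.gauss(0.0, 1.0)  # shared trend + noise
            if is_treated and t >= treat_start:
                y += effect  # constant (static) treatment effect
            series.append(y)
        panel[u] = series
    return panel, treated


def aggregate(panel, months_per_period):
    """Average consecutive months into coarser periods (3 = quarterly, 12 = yearly)."""
    return {
        u: [mean(s[i:i + months_per_period]) for i in range(0, len(s), months_per_period)]
        for u, s in panel.items()
    }


def did_static(panel, treated, first_post):
    """Static 2x2 DiD: (post - pre) for treated minus (post - pre) for controls."""
    def change(units):
        pre = mean([mean(panel[u][:first_post]) for u in units])
        post = mean([mean(panel[u][first_post:]) for u in units])
        return post - pre

    controls = [u for u in panel if u not in treated]
    return change(sorted(treated)) - change(controls)


panel, treated = simulate_panel()
estimates = {}
for label, k in {"monthly": 1, "quarterly": 3, "yearly": 12}.items():
    # Treatment starts at month 36, which falls on a quarter and year boundary.
    estimates[label] = did_static(aggregate(panel, k), treated, first_post=36 // k)

for label, est in estimates.items():
    print(f"{label:>9}: {est:.4f}")
```

In this balanced, common-timing setup with treatment starting on a period boundary, the three point estimates agree up to floating-point error, because averaging equal-length blocks does not change the pre and post means. Differences across time scales arise once panels are unbalanced, timing is staggered, or dynamic effects are targeted, none of which this sketch covers.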
Data Sources and Analytic Sample
We used data from a study of police retraining for the resampling simulations and case study. These data included counts of use-of-force incidents and dates of training enrollment for 8614 officers each month from 2011 to 2016.
Principal Findings
Results from the simulation varied across performance metrics, estimation methods, target estimands, and data structures. In general, the choice of time aggregation was more consequential when estimating dynamic (versus static) treatment effects, in unbalanced (versus balanced) panel data, and in the resampling simulations (where data had less autocorrelation). Although time aggregation mattered little in many scenarios, coarser aggregation was preferable in resampling simulations of staggered timing scenarios. The re-analysis of police training data was sensitive to time aggregation.
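For readers unfamiliar with the performance metrics compared here, the following sketch shows one common way to compute them from simulation replicates. The function name `performance`, the use of the empirical standard deviation of estimates as the "standard error", and the normal critical value of 1.96 are assumptions; the abstract does not state how the authors defined each metric.

```python
from math import sqrt
from statistics import mean, stdev


def performance(estimates, std_errors, true_effect, z_crit=1.96):
    """Summarize replicate estimates against a known true effect.

    The rejection rate is read as power when true_effect != 0
    and as Type I error when true_effect == 0.
    """
    bias = mean(estimates) - true_effect
    empirical_se = stdev(estimates)  # spread of estimates across replicates
    rmse = sqrt(mean([(e - true_effect) ** 2 for e in estimates]))
    rejections = [1.0 if abs(e / s) > z_crit else 0.0
                  for e, s in zip(estimates, std_errors)]
    return {
        "bias": bias,
        "empirical_se": empirical_se,
        "rmse": rmse,
        "rejection_rate": mean(rejections),
    }


# Tiny hand-checkable example with three hypothetical replicates.
summary = performance([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], true_effect=2.0)
print(summary)
```

Because RMSE combines bias and variance, an aggregation level can look better on one metric and worse on another, which is consistent with the mixed results reported above.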
Conclusions
In many scenarios, time aggregation has little impact on difference-in-differences estimators. However, when estimating dynamic effects, especially in staggered timing settings and unbalanced data, we found a tradeoff between precision and power, with finer aggregations being more powerful but less precise. In addition, estimators that use a single reference time point are more sensitive to noise in data measured at finer time scales.
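The sensitivity to a single reference time point can be illustrated with a small simulation sketch. It assumes independent monthly noise, a true effect of zero, and a contrast between one post-treatment period and one reference period; the function names and parameter values are hypothetical and this is not the estimator used in the paper. Unit fixed effects are omitted because they cancel in the within-unit difference.

```python
import random
from statistics import mean, stdev


def event_study_contrast(rng, n_per_group, months_per_period, noise_sd=1.0):
    """One simulated dynamic-effect estimate relative to a single reference period."""
    def group_change():
        changes = []
        for _ in range(n_per_group):
            # Each period's outcome is the average of its monthly observations.
            ref = mean([rng.gauss(0.0, noise_sd) for _ in range(months_per_period)])
            post = mean([rng.gauss(0.0, noise_sd) for _ in range(months_per_period)])
            changes.append(post - ref)
        return mean(changes)

    # Treated change minus control change; both groups are pure noise here.
    return group_change() - group_change()


def sampling_sd(months_per_period, n_reps=300, n_per_group=25, seed=1):
    """Standard deviation of the contrast across simulated replicates."""
    rng = random.Random(seed)
    draws = [event_study_contrast(rng, n_per_group, months_per_period)
             for _ in range(n_reps)]
    return stdev(draws)


sd_by_scale = {label: sampling_sd(k)
               for label, k in (("monthly", 1), ("quarterly", 3), ("yearly", 12))}
for label, sd in sd_by_scale.items():
    print(f"{label:>9}: sampling SD = {sd:.3f}")
```

Under these assumptions, a one-month reference period carries the full monthly noise, while a yearly reference period averages over twelve months, so the sampling spread of the contrast shrinks as the time scale coarsens. With autocorrelated data the gain from averaging would be smaller.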
About the journal:
Health Services Research (HSR) is a peer-reviewed scholarly journal that provides researchers and public and private policymakers with the latest research findings, methods, and concepts related to the financing, organization, delivery, evaluation, and outcomes of health services. Rated as one of the top journals in the fields of health policy and services and health care administration, HSR publishes outstanding articles reporting the findings of original investigations that expand knowledge and understanding of the wide-ranging field of health care and that will help to improve the health of individuals and communities.