When Should You Adjust Standard Errors for Clustering?

PSN: Econometrics | Pub Date: 2017-10-09 | DOI: 10.3386/W24003

Alberto Abadie, S. Athey, G. Imbens, J. Wooldridge


Citations: 1105

Abstract

When Should You Adjust Standard Errors for Clustering?

Clustered standard errors, with clusters defined by factors such as geography, are widespread in empirical research in economics and many other disciplines. Formally, clustered standard errors adjust for the correlations induced by sampling the outcome variable from a data-generating process with unobserved cluster-level components. However, the standard econometric framework for clustering leaves important questions unanswered: (i) Why do we adjust standard errors for clustering in some ways but not others, e.g., by state but not by gender, and in observational studies, but not in completely randomized experiments? (ii) Why is conventional clustering an “all-or-nothing” adjustment, while within-cluster correlations can be strong or extremely weak? (iii) In what settings does the choice of whether and how to cluster make a difference? We address these and other questions using a novel framework for clustered inference on average treatment effects. In addition to the common sampling component, the new framework incorporates a design component that accounts for the variability induced on the estimator by the treatment assignment mechanism. We show that, when the number of clusters in the sample is a nonnegligible fraction of the number of clusters in the population, conventional cluster standard errors can be severely inflated, and propose new variance estimators that correct for this bias.
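To make the comparison in the abstract concrete, here is a minimal sketch of the conventional cluster-robust (sandwich) variance estimator for OLS, next to the heteroskedasticity-robust one. It is not the new variance estimator the authors propose, which the abstract does not specify. The function name `ols_with_se`, the simulated data, and the omission of finite-sample corrections are all assumptions made for illustration.

```python
import numpy as np


def ols_with_se(X, y, clusters):
    """OLS coefficients with HC0 robust and conventional cluster-robust SEs.

    Hypothetical helper for illustration; no finite-sample corrections applied.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Heteroskedasticity-robust "meat": sum over units of e_i^2 * x_i x_i'
    meat_robust = (X * resid[:, None] ** 2).T @ X

    # Cluster-robust "meat": sum over clusters of (X_g' e_g)(X_g' e_g)'
    k = X.shape[1]
    meat_cluster = np.zeros((k, k))
    for g in np.unique(clusters):
        mask = clusters == g
        score_g = X[mask].T @ resid[mask]
        meat_cluster += np.outer(score_g, score_g)

    se_robust = np.sqrt(np.diag(XtX_inv @ meat_robust @ XtX_inv))
    se_cluster = np.sqrt(np.diag(XtX_inv @ meat_cluster @ XtX_inv))
    return beta, se_robust, se_cluster


# Simulated example (assumed design): treatment assigned at the cluster level,
# outcome contains an unobserved cluster-level component.
rng = np.random.default_rng(0)
n_clusters, n_per = 50, 40
clusters = np.repeat(np.arange(n_clusters), n_per)
treat = np.repeat(rng.integers(0, 2, n_clusters), n_per).astype(float)
cluster_effect = np.repeat(rng.normal(size=n_clusters), n_per)
y = 1.0 + 0.5 * treat + cluster_effect + rng.normal(size=n_clusters * n_per)
X = np.column_stack([np.ones_like(treat), treat])

beta, se_robust, se_cluster = ols_with_se(X, y, clusters)
print("coef:", beta)
print("robust SE:", se_robust)
print("cluster SE:", se_cluster)
```

In this simulated design both the treatment and the error component vary at the cluster level, so the clustered standard error on the treatment coefficient comes out larger than the robust one. When every observation is its own cluster, the two estimators coincide.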

