Title: Using Wasserstein Generative Adversarial Networks for the Design of Monte Carlo Simulations
Authors: S. Athey, G. Imbens, Jonas Metzger, Evan Munro
DOI: 10.3386/w26566 (https://doi.org/10.3386/w26566)
Journal: Econometrics: Econometric & Statistical Methods - General eJournal
Publication date: 2019-09-05
Publication type: Journal Article
Citations: 63
Abstract
When researchers develop new econometric methods, it is common practice to compare the performance of the new methods to that of existing methods in Monte Carlo studies. The credibility of such Monte Carlo studies is often limited because of the freedom the researcher has in choosing the design. In recent years, a new class of generative models has emerged in the machine learning literature, termed Generative Adversarial Networks (GANs), that can be used to systematically generate artificial data that closely mimic real economic datasets, while limiting the degrees of freedom for the researcher and optionally satisfying privacy guarantees with respect to their training data. In addition, if an applied researcher is concerned with the performance of a particular statistical method on a specific data set (beyond its theoretical properties in large samples), she may wish to assess the performance, e.g., the coverage rate of confidence intervals or the bias of the estimator, using simulated data that resemble her setting. To illustrate these methods, we apply Wasserstein GANs (WGANs) to compare a number of different estimators for average treatment effects under unconfoundedness in three distinct settings (corresponding to three real data sets) and present a methodology for assessing the robustness of the results. In this example, we find that (i) there is not one estimator that outperforms the others in all three settings, so researchers should tailor their analytic approach to a given setting, and (ii) systematic simulation studies can be helpful for selecting among competing methods in this situation.
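The evaluation step the abstract describes (bias and confidence-interval coverage of competing estimators on simulated data) can be sketched roughly as follows. This is only an illustration under stated assumptions: `draw_sample` is a hypothetical hand-written stand-in for a generator fitted to real data, not a WGAN, and `TRUE_ATE`, the two estimators, and the sample sizes are chosen here for illustration rather than taken from the paper.

```python
import numpy as np

# Assumption: the true average treatment effect is known by construction,
# which is what allows bias and coverage to be measured at all.
TRUE_ATE = 1.0


def draw_sample(rng, n=500):
    """Hypothetical stand-in for a fitted generator (NOT a WGAN).

    Treatment probability depends on the covariate x, so the data are
    confounded but unconfoundedness holds given x.
    """
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-x))
    w = rng.binomial(1, p)
    y = TRUE_ATE * w + x + rng.normal(size=n)
    return x, w, y


def diff_in_means(x, w, y):
    """Naive estimator that ignores x; returns (estimate, standard error)."""
    y1, y0 = y[w == 1], y[w == 0]
    est = y1.mean() - y0.mean()
    se = np.sqrt(y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size)
    return est, se


def ols_adjusted(x, w, y):
    """OLS of y on (1, w, x); returns the coefficient on w and its
    homoskedastic standard error."""
    design = np.column_stack([np.ones_like(x), w, x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (y.size - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[1], np.sqrt(cov[1, 1])


def evaluate(estimator, n_reps=500, seed=0):
    """Monte Carlo bias, RMSE, and coverage of nominal 95% intervals."""
    rng = np.random.default_rng(seed)
    ests = np.empty(n_reps)
    covered = np.empty(n_reps, dtype=bool)
    for r in range(n_reps):
        est, se = estimator(*draw_sample(rng))
        ests[r] = est
        covered[r] = abs(est - TRUE_ATE) <= 1.96 * se
    err = ests - TRUE_ATE
    return {
        "bias": float(err.mean()),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "coverage": float(covered.mean()),
    }


if __name__ == "__main__":
    for name, estimator in [("diff_in_means", diff_in_means),
                            ("ols_adjusted", ols_adjusted)]:
        print(name, evaluate(estimator))
```

In the setting the abstract describes, `draw_sample` would presumably be replaced by draws from a generator trained on the real data set, which removes most of the researcher's discretion over the simulation design. The evaluation loop itself would stay the same.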