C. Bowen, V. Bryant, Leonard Burman, Surachai Khitatrakun, R. McClelland, Livia Mucciolo, Madeline Pickens, Aaron R. Williams
Synthetic Individual Income Tax Data: Promises and Challenges
National Tax Journal, vol. 75, pp. 767-790. Published 2022-08-05.
DOI: https://doi.org/10.1086/722094
Citations: 3
Abstract
Tax data are invaluable for research, but privacy concerns severely limit access. Although the US Internal Revenue Service produces a public-use file (PUF), improved technology and the proliferation of individual data have made it increasingly difficult to protect. Synthetic data are an alternative that reproduces the statistical properties of administrative data without revealing individual taxpayer information. This paper evaluates the quality and safety of the first fully synthetic PUF and demonstrates its performance in tax model microsimulations. The synthetic PUF could also be used to develop and debug statistical programs that could then be safely run on confidential data via a validation server.
About the journal:
The goal of the National Tax Journal (NTJ) is to encourage and disseminate high-quality original research on governmental tax and expenditure policies. Articles published in the regular March, June, and September issues of the journal, as well as articles accepted for publication in special issues, are subject to professional peer review and include economic, theoretical, and empirical analyses of tax and expenditure issues with an emphasis on policy implications. The NTJ has been published quarterly since 1948 under the auspices of the National Tax Association (NTA). Most issues include an NTJ Forum, which consists of invited papers by leading scholars that examine in depth a single current tax or expenditure policy issue. The December issue is devoted to publishing papers presented at the NTA’s annual Spring Symposium; the articles in the December issue generally are not subject to peer review.