Links and legibility: Making sense of historical US Census automated linking methods*

Arkadev Ghosh, S. Hwang, Munir Squires
Journal: Journal of Business & Economic Statistics
Published: 2023-04-24
DOI: https://doi.org/10.1080/07350015.2023.2205918
Citations: 1

Abstract

How does handwriting legibility affect the performance of algorithms that link individuals across census rounds? We propose a measure of legibility, which we implement at scale for the 1940 US Census, and find strikingly wide variation in enumeration-district-level legibility. Using boundary discontinuities in enumeration districts, we estimate the causal effect of low legibility on the quality of linked samples, measured by linkage rates and share of validated links. Our estimates imply that, across eight linking algorithms, perfect legibility would increase the linkage rate by 5 to 10 percentage points. Improvements in transcription could substantially increase the quality of linked samples.

*We thank Santiago Pérez and seminar participants at the Midwest Economic Association conference, the Western Economic Association virtual international conference, and the UBC Econometrics lunch for their valuable comments. This research was undertaken thanks to funding from the Canada Excellence Research Chairs program awarded to Dr. Erik Snowberg in Data-Intensive Methods in Economics. Correspondence can be addressed to hwangii@mail.ubc.ca.

†briq: Institute on Behavior and Inequality
‡University of British Columbia
§University of British Columbia