Microsimulation of an educational attainment register to study record linkage quality.

IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES
Maya Murmann, Douglas Manuel
International Journal of Population Data Science. Journal Article. Published 2022-08-25.
DOI: 10.23889/ijpds.v7i3.1848 (https://doi.org/10.23889/ijpds.v7i3.1848)
Citations: 0

Abstract

Population-covering educational attainment registers have proven helpful for planning and research on educational efforts. Building and updating such a register requires regular linking of different databases. Without unique national identification numbers, record linkage must be based on quasi-identifiers such as names, date of birth and sex. High-quality record linkage requires the unique identification of persons. The available identifiers should therefore be sufficient for unique identification even when some identifiers are missing for some cases. Redundant identifiers can achieve this goal. However, the data protection principle of data minimization, as recommended in the European General Data Protection Regulation, aims to avoid collecting data beyond what the given purpose requires. A ministry therefore commissioned a simulation study to inform legislators about the minimum set of identifiers needed for a national register. A microsimulation of a population of nearly 20 million people was implemented to generate data on the changes and errors in identifiers that accumulate over ten simulated years. The simulation covered, for example, international migration, regional mobility, marriages, school careers and mortality. Each event triggered changes of identifiers according to specified error probability models. The resulting data were linked using different record-linkage procedures, and linkage quality and linkage bias were assessed as a function of the available identifiers. We report on the design of the simulation study, the linkage results and recommendations for the minimum set of identifiers. The results may be helpful for the design of other population-covering registers.
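The general idea described above (events that alter identifiers with given probabilities, followed by linkage on different identifier sets) can be illustrated with a minimal sketch. It is not the study's implementation: the name lists, the two event types, the probabilities, the function names and the exact-match linkage rule are all hypothetical choices made for illustration, and the toy population is far smaller than the one simulated in the study.

```python
import random
from collections import defaultdict

# Hypothetical name pools; the study's data sources are not described here.
FIRST = ["Anna", "Ben", "Clara", "David", "Eva", "Felix"]
LAST = ["Becker", "Fischer", "Klein", "Meyer", "Schmidt", "Wagner"]

def make_population(n, rng):
    """Baseline file: one record per person with quasi-identifiers."""
    return [
        {"pid": i,  # true identity, used only to evaluate the linkage
         "first": rng.choice(FIRST),
         "last": rng.choice(LAST),
         "dob": (rng.randint(1990, 2010), rng.randint(1, 12), rng.randint(1, 28)),
         "sex": rng.choice("FM")}
        for i in range(n)
    ]

def simulate(population, years, p_marriage, p_typo, rng):
    """Apply yearly events; each event may alter an identifier."""
    current = [dict(rec) for rec in population]
    for _ in range(years):
        for rec in current:
            if rng.random() < p_marriage:   # assumed event: surname change
                rec["last"] = rng.choice(LAST)
            if rng.random() < p_typo:       # assumed error: last letter dropped
                rec["first"] = rec["first"][:-1]
    return current

def link(file_a, file_b, keys):
    """Exact-match linkage: accept a pair only if the key is unique in both files."""
    def index(records):
        idx = defaultdict(list)
        for rec in records:
            idx[tuple(rec[k] for k in keys)].append(rec["pid"])
        return idx
    idx_a, idx_b = index(file_a), index(file_b)
    return [(pids[0], idx_b[key][0]) for key, pids in idx_a.items()
            if len(pids) == 1 and len(idx_b.get(key, [])) == 1]

def quality(links, n_true):
    """Precision and recall of the accepted links against the true identities."""
    correct = sum(1 for a, b in links if a == b)
    precision = correct / len(links) if links else 0.0
    return precision, correct / n_true

rng = random.Random(42)
base = make_population(2000, rng)
later = simulate(base, years=10, p_marriage=0.02, p_typo=0.01, rng=rng)
for keys in (("last", "dob"), ("first", "last", "dob", "sex")):
    precision, recall = quality(link(base, later, keys), len(base))
    print(keys, round(precision, 3), round(recall, 3))
```

Comparing the two identifier sets shows the trade-off the abstract refers to: a smaller set leaves more records without a unique key, while every additional identifier is another field in which an accumulated change or error can break a match.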