Development and Evaluation of Record Linkage Rules in a Safety-Net Health System Serving Disadvantaged Communities

ACI open, Pub Date: 2019-07-01, DOI: 10.1055/S-0039-1693129

W. Trick, K. Doshi, Michael J Ray, F. Angulo


Citations: 2

Abstract

Background. There is a need for flexible, accurate record-linkage systems with transparent rules that work across diverse populations.

Objectives. We developed rules responsive to challenges in linking records for an urban safety-net health system; we calculated performance characteristics for our algorithm.

Methods. We evaluated encounters from January 1, 2012 through September 30, 2018. We compared our algorithm, which uses name (first-last), date of birth (DOB), and the last four digits of the social security number, to our electronic health record (EHR) system's reconciliation process. We applied our algorithm to unreconciled real-time Admission-Discharge-Transfer registration data and compared match results to reconciled identities from our enterprise data warehouse. We manually validated matches for randomly sampled discordant pairs; we calculated sensitivity and specificity. We evaluated predictors of discordance, including census tract information.

Results. Of 771,477 unique medical record numbers, most (95%) were concordant between systems; a substantial minority (5%) were discordant. Of 38,993 discordant pairs, most (n = 36,539; 94%) were detected by our local algorithm. The sensitivity of our algorithm was higher than that of the EHR process (99% vs. 81%), but its specificity was lower (98.6% vs. 99.9%). Our highest-yield rules, beyond full first and last name plus complete DOB match, were the first three letters of the first name, transposed first-last names, and DOB offsets (+1 and +365 days). Factors associated with discordance were homelessness (adjusted odds ratio [aOR] = 2.4; 95% confidence interval [CI], 2.2–2.6) and living in a census tract with high levels of poverty (aOR = 1.4; 95% CI, 1.3–1.4).

Conclusion. Our algorithm had superior sensitivity compared to our EHR process. Homelessness and poverty were associated with unmatched records. Improved sensitivity was attributable to several critical input-variable processing steps useful for similar difficult-to-link populations.
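The rules named in the abstract (exact name plus DOB, first three letters of the first name, transposed first-last names, and DOB offsets of 1 and 365 days) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Person` record, the `norm` normalization, the rule labels, the treatment of DOB offsets as absolute day differences, and the requirement that any relaxed rule be corroborated by a matching last-four SSN are all assumptions made for illustration; the abstract does not say how the fields are combined.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical registration record; field names are illustrative."""
    first: str
    last: str
    dob: date
    ssn4: str = ""  # last four SSN digits; empty string when unknown


def norm(name: str) -> str:
    # Assumed normalization: uppercase, letters only.
    return "".join(ch for ch in name.upper() if ch.isalpha())


def name_rule(a: Person, b: Person) -> Optional[str]:
    af, al, bf, bl = norm(a.first), norm(a.last), norm(b.first), norm(b.last)
    if (af, al) == (bf, bl):
        return "exact_name"
    if (af, al) == (bl, bf):
        return "transposed_name"
    # Same last name and same first three letters of the first name.
    if al == bl and len(af) >= 3 and af[:3] == bf[:3]:
        return "first3_name"
    return None


def dob_rule(a: Person, b: Person) -> Optional[str]:
    # Assumption: offsets are literal day differences in either direction.
    delta = abs((a.dob - b.dob).days)
    if delta == 0:
        return "exact_dob"
    if delta == 1:
        return "dob_offset_1"
    if delta == 365:
        return "dob_offset_365"
    return None


def link(a: Person, b: Person) -> Optional[str]:
    """Return a label naming the rule that linked a and b, or None."""
    n, d = name_rule(a, b), dob_rule(a, b)
    if n is None or d is None:
        return None
    if n == "exact_name" and d == "exact_dob":
        return "exact_name+exact_dob"
    # Assumption: relaxed rules need a matching, non-empty last-four SSN.
    if a.ssn4 and a.ssn4 == b.ssn4:
        return f"{n}+{d}+ssn4"
    return None
```

For example, `link(Person("Jonathan", "Rivera", date(1980, 3, 14), "1234"), Person("Jon", "Rivera", date(1980, 3, 14), "1234"))` returns `"first3_name+exact_dob+ssn4"`, while the same pair with no SSN digits on file returns `None`. Returning the rule label rather than a bare boolean keeps each link traceable to a specific rule, in the spirit of the transparent rules the abstract calls for.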
