Unifying and linking data sources in medical and public health research

Kavita Batra, Vidhani S. Goel, Ana L. Reyes, Bertille Assoumou, Dodds P. Simangan, Farooq Abdulla, Deborah A. Kuhls
Journal of Medicine, Surgery, and Public Health, Volume 5, Article 100164. Published 2024-12-06. DOI: 10.1016/j.glmedi.2024.100164. URL: https://www.sciencedirect.com/science/article/pii/S2949916X24001178

Abstract

Data linkage methods, including probabilistic, deterministic, and hybrid approaches, are critical for linking medical and public health records, expanding data scope, and improving research outcomes. These methods differ in accuracy, efficiency, and scalability. This letter seeks to identify best practices for enhancing data quality and linkage rates in healthcare and public health research using these techniques. Data linkage enhances data quality by removing duplicates and correcting artifacts, facilitates cost-effective longitudinal studies by integrating existing data, and supports public health through person-oriented statistics and disease registries. Tools like "RecordLinkage" in R and EpiLink have advanced linkage accuracy, particularly in epidemiological studies. A PubMed search in November 2023 identified 176 studies, with 29 meeting inclusion criteria. Hybrid methods showed superior accuracy, with some studies achieving linkage rates over 90%. Emerging AI-driven methods can further improve scalability, efficiency, and automation, employing privacy-preserving techniques such as federated learning to address confidentiality concerns. However, challenges such as inconsistent data, incomplete identifiers, and technical complexities remain, emphasizing the need for standardized protocols and robust ethical frameworks. In low- and middle-income countries (LMICs), tailored strategies such as enhancing health information systems, adopting open-source tools, and fostering regional collaborations are essential to address resource constraints. Initiatives like the Western Australian Data Linkage System exemplify the potential impact of linkage on healthcare and public health. Future research should focus on refining methods, integrating diverse datasets, and leveraging AI to improve linkage efficiency and reliability. By adopting best practices, data linkage can enhance decision-making, optimize interventions, and advance global health research.
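
To make the distinction between deterministic, probabilistic, and hybrid linkage concrete, the following is a minimal Python sketch of one common way a hybrid approach can be structured: an exact match on a unique identifier first, then a Fellegi-Sunter style agreement score as a fallback. It is not the method of any study in the review, nor the API of RecordLinkage or EpiLink. The field names, the m/u probabilities, and the threshold are all hypothetical values chosen for illustration.

```python
from math import log2

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.97, 0.005),
    "postcode": (0.90, 0.05),
}


def match_weight(rec_a, rec_b):
    """Sum Fellegi-Sunter style log2 weights over the comparison fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing identifier contributes no evidence
        if a == b:
            total += log2(m / u)              # agreement weight (positive)
        else:
            total += log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def hybrid_link(rec_a, rec_b, threshold=8.0):
    """Deterministic step first, then a probabilistic fallback.

    Returns "deterministic", "probabilistic", or None (no link).
    The "national_id" field and the threshold are assumptions.
    """
    id_a, id_b = rec_a.get("national_id"), rec_b.get("national_id")
    if id_a is not None and id_a == id_b:
        return "deterministic"
    if match_weight(rec_a, rec_b) >= threshold:
        return "probabilistic"
    return None
```

In this sketch the deterministic step handles records with a complete shared identifier cheaply, while the probabilistic step recovers links when that identifier is missing, which is one reason hybrid designs are often used with incomplete identifiers. In practice the m and u values would be estimated from the data rather than fixed by hand.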