Unifying and linking data sources in medical and public health research

Kavita Batra, Vidhani S. Goel, Ana L. Reyes, Bertille Assoumou, Dodds P. Simangan, Farooq Abdulla, Deborah A. Kuhls
Journal of Medicine, Surgery, and Public Health, Volume 5, Article 100164. Published 2024-12-06.
DOI: 10.1016/j.glmedi.2024.100164
https://www.sciencedirect.com/science/article/pii/S2949916X24001178

Abstract

Data linkage methods, including probabilistic, deterministic, and hybrid approaches, are critical for linking medical and public health records, expanding data scope, and improving research outcomes. These methods differ in accuracy, efficiency, and scalability. This letter seeks to identify best practices for enhancing data quality and linkage rates in healthcare and public health research using these techniques. Data linkage enhances data quality by removing duplicates and correcting artifacts, facilitates cost-effective longitudinal studies by integrating existing data, and supports public health through person-oriented statistics and disease registries. Tools like "RecordLinkage" in R and EpiLink have advanced linkage accuracy, particularly in epidemiological studies. A PubMed search in November 2023 identified 176 studies, of which 29 met the inclusion criteria. Hybrid methods showed superior accuracy, with some studies achieving linkage rates above 90%. Emerging AI-driven methods can further improve scalability, efficiency, and automation, and can employ privacy-preserving techniques like federated learning to address confidentiality concerns. However, challenges such as inconsistent data, incomplete identifiers, and technical complexities remain, emphasizing the need for standardized protocols and robust ethical frameworks. In low- and middle-income countries (LMICs), tailored strategies such as enhancing health information systems, adopting open-source tools, and fostering regional collaborations are essential to address resource constraints. Initiatives like the Western Australian Data Linkage System exemplify the potential impact of linkage on healthcare and public health. Future research should focus on refining methods, integrating diverse datasets, and leveraging AI to improve linkage efficiency and reliability. By adopting best practices, data linkage can enhance decision-making, optimize interventions, and advance global health research.
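
To make the distinction between the three method families concrete, the following is a minimal sketch of a hybrid linkage pass, not the procedure used in any of the reviewed studies. It assumes a deterministic step that matches records on an exact shared identifier, followed by a probabilistic step that scores the remaining records with Fellegi-Sunter style agreement weights. The field names (national_id, surname, birth_date, postcode), the m- and u-probabilities, and the threshold of 10.0 are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field parameters (illustrative values only):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "postcode": (0.90, 0.05),
}


def match_weight(rec_a, rec_b):
    """Sum the log2 agreement/disagreement weights over the comparison fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing identifier contributes no evidence either way
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def hybrid_link(left, right, threshold=10.0):
    """Deterministic pass on a shared ID, then a greedy probabilistic pass."""
    links = []
    right_by_id = {r["national_id"]: r for r in right if r.get("national_id")}
    matched_right = set()
    unmatched_left = []

    # Step 1 (deterministic): exact agreement on the unique identifier.
    for l in left:
        hit = right_by_id.get(l["national_id"]) if l.get("national_id") else None
        if hit is not None:
            links.append((l["rec"], hit["rec"], "deterministic"))
            matched_right.add(hit["rec"])
        else:
            unmatched_left.append(l)

    # Step 2 (probabilistic): best-scoring remaining candidate above threshold.
    candidates = [r for r in right if r["rec"] not in matched_right]
    for l in unmatched_left:
        best = max(candidates, key=lambda r: match_weight(l, r), default=None)
        if best is not None and match_weight(l, best) >= threshold:
            links.append((l["rec"], best["rec"], "probabilistic"))
            candidates.remove(best)  # simplification: one-to-one, greedy
    return links


# Made-up records for demonstration.
left = [
    {"rec": "A1", "national_id": "111", "surname": "garcia",
     "birth_date": "1980-02-01", "postcode": "89102"},
    {"rec": "A2", "national_id": None, "surname": "nguyen",
     "birth_date": "1975-07-15", "postcode": "89044"},
    {"rec": "A3", "national_id": None, "surname": "okafor",
     "birth_date": "1990-01-01", "postcode": "89101"},
]
right = [
    {"rec": "B1", "national_id": "111", "surname": "garcia-lopez",
     "birth_date": "1980-02-01", "postcode": "89102"},
    {"rec": "B2", "national_id": None, "surname": "nguyen",
     "birth_date": "1975-07-15", "postcode": "89123"},
    {"rec": "B3", "national_id": None, "surname": "smith",
     "birth_date": "1962-03-09", "postcode": "89101"},
]

links = hybrid_link(left, right)
print(links)
```

In this sketch the first pair is linked on the identifier alone even though the surnames differ, while the second pair has no identifier and is linked only because agreement on surname and birth date outweighs the postcode mismatch. In practice the m- and u-probabilities would be estimated from the data and the threshold tuned against a validated sample, typically with an established tool rather than hand-written code.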