The Associations Between Sensitivity and Specificity With Prevalence in Data Matching.

IF 1.9 · Zone 4 (Medicine) · Q2 · Public, Environmental & Occupational Health
Qiang Xia, Kacie Seil, Prima Manandhar-Sasaki, Daniel Bertolino, Valentina Mara, Lucia V Torian, Wenhui Li
Journal of Public Health Management and Practice. Journal article, published 2026-05-05.
DOI: 10.1097/PHH.0000000000002355 (https://doi.org/10.1097/PHH.0000000000002355)

Abstract

Objective: To assess how sensitivity and specificity are associated with prevalence in data matching.

Methods: Using publicly available data, a synthetic dataset of names ("source population"; 8 million records) was created, with each record randomly assigned a sex, a birth date, and a positive or negative status for a health outcome. All positives were included in file 1 ("disease registry"), and random samples of positives and negatives were selected and merged to create file 2 ("study population"). The prevalence in the source population was defined as the proportion of individuals in the synthetic dataset who were randomly assigned as positive, and the prevalence in the study population as the proportion of individuals in the study population who were positive. Multiple disease registry and study population file pairs were created, with prevalence varied in both the source and study populations, and each pair was matched. Link Plus 3.0, a probabilistic record linkage program, was used for the data matching.
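The file construction described in the Methods can be sketched in a few lines of Python. This is a toy illustration only, not the authors' code: the function names `build_file_pair` and `prevalence`, the integer record IDs, and the small population sizes are all assumptions made for the example, and no names, sexes, or birth dates are generated.

```python
import random


def build_file_pair(n_source, source_prevalence, n_study, study_prevalence, seed=0):
    """Build one (registry, study population) pair from a synthetic source population.

    Hypothetical helper: records are plain integer IDs rather than names.
    """
    rng = random.Random(seed)
    ids = list(range(n_source))

    # Randomly assign the requested share of source records as positive.
    n_pos = round(n_source * source_prevalence)
    positives = set(rng.sample(ids, n_pos))
    negatives = [i for i in ids if i not in positives]

    # File 1 ("disease registry"): every positive in the source population.
    registry = sorted(positives)

    # File 2 ("study population"): random samples of positives and negatives,
    # sized so the study population has the requested prevalence.
    n_study_pos = round(n_study * study_prevalence)
    study = rng.sample(registry, n_study_pos) + rng.sample(negatives, n_study - n_study_pos)
    rng.shuffle(study)
    return registry, study, positives


def prevalence(ids, positives):
    """Proportion of the given records that are positive."""
    return sum(1 for i in ids if i in positives) / len(ids)


# Toy sizes (assumed): 10,000 source records instead of 8 million.
registry, study, positives = build_file_pair(10_000, 0.10, 500, 0.50, seed=1)
source_prev = prevalence(range(10_000), positives)
study_prev = prevalence(study, positives)
print(f"source prevalence={source_prev:.3f}, study prevalence={study_prev:.3f}")
```

The point of the sketch is that the two prevalences are set independently: the source prevalence fixes the size of the registry, while the study prevalence is fixed by how many positives and negatives are sampled into file 2.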

Results: As the prevalence in the source population increases from 0.1% to 10%, the sensitivity increases from 80.0% to 94.6% and the specificity decreases slightly; as the prevalence in the study population increases from 10% to 99%, the sensitivity remains stable around 95.0% and the specificity stays at about 100.0%.

Conclusions: In data matching, sensitivity is positively associated and specificity is negatively associated with the prevalence in the source population; neither is associated with the prevalence in the study population.
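For readers who want the two accuracy measures spelled out, here is a minimal Python sketch using the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), applied to records in the study population. It assumes these are the definitions the abstract uses; the function name `sensitivity_specificity` and the `linked_ids` set (study records that the linkage matched to the registry) are hypothetical, and the output of Link Plus itself is not modeled.

```python
def sensitivity_specificity(study_ids, positives, linked_ids):
    """Return (sensitivity, specificity) of a linkage against known truth.

    study_ids:  record IDs in the study population file
    positives:  set of IDs that are truly positive (present in the registry)
    linked_ids: set of study IDs the linkage matched to a registry record
    """
    tp = fp = tn = fn = 0
    for record_id in study_ids:
        truly_positive = record_id in positives
        linked = record_id in linked_ids
        if truly_positive and linked:
            tp += 1  # correctly matched
        elif truly_positive:
            fn += 1  # missed match
        elif linked:
            fp += 1  # false match
        else:
            tn += 1  # correctly left unmatched
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Made-up toy data: 10 study records, 4 truly positive, 4 linked (one of them wrongly).
study_ids = list(range(1, 11))
positives = {1, 2, 3, 4}
linked_ids = {1, 2, 3, 7}
sens, spec = sensitivity_specificity(study_ids, positives, linked_ids)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this made-up example three of the four true positives are linked and one of the six negatives is falsely linked.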

Source journal
Journal of Public Health Management and Practice (Public, Environmental & Occupational Health)
CiteScore: 3.40
Self-citation rate: 9.10%
Articles published: 287
About the journal: Journal of Public Health Management and Practice publishes articles that focus on evidence-based public health practice and research. The journal is a bi-monthly peer-reviewed publication guided by a multidisciplinary editorial board of administrators, practitioners, and scientists. It covers a wide range of population health topics, including research to practice; emergency preparedness; bioterrorism; infectious disease surveillance; environmental health; community health assessment; chronic disease prevention and health promotion; and academic-practice linkages.