Improving disease misclassification and prevalence estimates by linking primary and secondary care electronic health records: an illustration from arthritis research.
Impact factor 4.8; Zone 2 (Medicine); JCR Q1, Public, Environmental & Occupational Health
Belay Birlie Yimer, Fangyuan Zhang, Jenny Humphreys, Mark Lunt, Meghna Jani, John McBeth, William G Dixon
American Journal of Epidemiology. Journal Article. Published 2025-09-17. DOI: 10.1093/aje/kwaf206 (https://doi.org/10.1093/aje/kwaf206)
Citations: 0
Abstract
Prevalence estimates using primary care health data identify cases via code lists. Validation studies can discover and exclude false positives, but it is often difficult or impossible to find false negatives. Using psoriatic arthritis (PsA) as an example, this study aimed to examine the extent of misclassification and adjust for it by linking primary care records with text-mined outpatient letters from a North-West regional hospital (2014-2019). A total of 245 cases of PsA were identified among 188,286 adults registered with primary care, giving an observed prevalence of 0.13% [95% CI 0.11%-0.15%]. Among a subgroup of 7,532 primary care patients attending the hospital rheumatology clinic, 202 had a primary care PsA code: 188 were confirmed as true PsA, while 14 were false positives. Primary care codes failed to identify 196 hospital-diagnosed PsA cases, leading to a more than two-fold underestimation. The adjusted prevalence, accounting for misclassification, was 0.25% [95% CI 0.21%-0.28%]. Linking primary care with hospital records identified both false positives and false negatives, enabling correction of prevalence estimates. This highlights the value of text-mining hospital letters to compensate for the national absence of coded secondary care diagnosis data from outpatient departments, and the importance of considering the impact of false negatives.
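The arithmetic behind the figures quoted in the abstract can be sketched as follows. This is an illustration only, not the authors' method: the abstract does not say how the adjusted estimate or its confidence interval were derived. The sketch assumes a simple Wald interval for the observed prevalence, and it assumes that the positive predictive value and sensitivity seen in the rheumatology clinic subgroup carry over to the whole registered population, which is a strong assumption. All variable and function names are hypothetical.

```python
from math import sqrt


def wald_ci(cases, n, z=1.96):
    """Proportion with a Wald (normal approximation) 95% interval.

    The interval type is an assumption; the abstract does not name one.
    """
    p = cases / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts quoted in the abstract
coded_cases = 245        # adults with a primary care PsA code
population = 188_286     # adults registered with primary care
true_pos = 188           # coded and confirmed by hospital letters
false_pos = 14           # coded but not confirmed
false_neg = 196          # hospital-diagnosed but not coded in primary care

# Observed prevalence from primary care codes alone
observed, ci_low, ci_high = wald_ci(coded_cases, population)

# Code-list performance within the linked clinic subgroup
ppv = true_pos / (true_pos + false_pos)
sensitivity = true_pos / (true_pos + false_neg)

# Naive correction (assumption): drop the expected false positives,
# then scale up for the cases the code list misses.
adjusted_cases = coded_cases * ppv / sensitivity
adjusted = adjusted_cases / population

print(f"observed prevalence: {observed:.2%} ({ci_low:.2%} to {ci_high:.2%})")
print(f"PPV: {ppv:.1%}, sensitivity: {sensitivity:.1%}")
print(f"naively adjusted prevalence: {adjusted:.2%}")
```

Under these assumptions the naive correction lands at roughly 0.25%, in line with the adjusted prevalence reported above. It gives no interval for the adjusted figure, since that would require propagating the uncertainty in the PPV and sensitivity estimates.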
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.