Evaluating the generalizability of commercial healthcare claims data.
Authors: Alex Dahlen, Yaowei Deng, Vivek Charu
Journal: American Journal of Epidemiology (impact factor 5.0; JCR Q1, Public, Environmental & Occupational Health)
Publication type: Journal Article
Published: 2025-06-30
DOI: 10.1093/aje/kwaf142 (https://doi.org/10.1093/aje/kwaf142)
Citations: 0
Abstract
Commercial healthcare claims datasets are a non-random sample of the US population, which affects generalizability. Rigorous comparisons of claims-derived results to ground-truth data that quantify external validity bias are lacking. Our goal is to (1) quantify the external validity of commercial healthcare claims data, and (2) evaluate how socioeconomic/demographic factors are related to the bias. We analyzed inpatient discharge records occurring between 01/01/2019 and 12/31/2019 in five states (California, Iowa, Maryland, Massachusetts, and New Jersey) and compared rates (per person-year) of the 250 most common inpatient procedures between claims and reference data for each target population. We used the Merative™ MarketScan® Commercial Database for the claims data, and the State Inpatient Databases (SID) and the US Census as the reference. For a target population of all Americans, commercial healthcare claims underestimate the rate of overall inpatient discharges by 23.1%. The extent of bias varied across procedures, with the rates of ~25% of procedures being underestimated by a factor of 2. Socioeconomic factors were significantly associated with the magnitude of bias ($R^2 = 69.4\%$, $p < 0.001$). When the target population was restricted to commercially insured Americans, the bias decreased substantially (1.4% of procedures were biased by more than a factor of 2), but some variation across procedures remained.
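The rate comparison described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes bias is expressed as the relative difference between the claims-derived rate and the reference rate (both in events per person-year), and that a procedure counts as "biased by more than a factor of 2" when the rate ratio is above 2 or below 1/2. The names `ProcedureCounts`, `rate_ratio`, `percent_bias`, and `share_off_by_factor`, and the toy counts, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ProcedureCounts:
    """Event counts and person-time for one procedure (hypothetical layout)."""
    name: str
    claims_events: int          # events observed in the claims sample
    claims_person_years: float  # enrollment time in the claims sample
    ref_events: int             # events in the reference discharge data
    ref_person_years: float     # reference population denominator


def rate_ratio(p: ProcedureCounts) -> float:
    """Claims rate divided by reference rate (both per person-year)."""
    claims_rate = p.claims_events / p.claims_person_years
    ref_rate = p.ref_events / p.ref_person_years
    return claims_rate / ref_rate


def percent_bias(p: ProcedureCounts) -> float:
    """Relative bias in percent; negative means claims underestimate."""
    return 100.0 * (rate_ratio(p) - 1.0)


def share_off_by_factor(procs: Sequence[ProcedureCounts], factor: float = 2.0) -> float:
    """Fraction of procedures whose rate ratio is above `factor` or below 1/factor."""
    off = [p for p in procs if rate_ratio(p) > factor or rate_ratio(p) < 1.0 / factor]
    return len(off) / len(procs)


# Toy counts, invented purely for illustration (not data from the study).
toy = [
    ProcedureCounts("procedure_a", 40, 1_000.0, 1_000, 10_000.0),
    ProcedureCounts("procedure_b", 90, 1_000.0, 1_000, 10_000.0),
    ProcedureCounts("procedure_c", 110, 1_000.0, 1_000, 10_000.0),
    ProcedureCounts("procedure_d", 80, 1_000.0, 1_000, 10_000.0),
]

if __name__ == "__main__":
    for p in toy:
        print(f"{p.name}: ratio={rate_ratio(p):.2f}, bias={percent_bias(p):+.1f}%")
    print(f"share off by more than 2x: {share_off_by_factor(toy):.2f}")
```

Under this convention, a reported underestimate of 23.1% would correspond to a rate ratio of about 0.769. The published analysis may define bias or the factor-of-2 threshold differently.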
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.