Evaluating the Impact of Data Standardization on Real-World Data

IF 2.4 · Zone 4 (Medicine) · JCR Q3 (PHARMACOLOGY & PHARMACY)
Elizabeth M Garry, Aidan Baglivo, Priya Govil, Jennifer L Duryea, Wei Liu, Tamar Lasky, Aloka Chakravarty, Donna R Rivera, Marie C Bradley
Pharmacoepidemiology and Drug Safety, vol. 34, no. 8, e70191. Published 2025-08-01. Journal Article.
DOI: 10.1002/pds.70191 (https://doi.org/10.1002/pds.70191)
Citations: 0

Abstract

Evaluating the Impact of Data Standardization on Real-World Data.

Purpose: To understand the impact of standardizing administrative healthcare data to the Sentinel common data model for cohort selection and descriptive findings.

Methods: Among patients with an outpatient COVID-19 diagnosis (January 2021-December 2022) in HealthVerity using the data in its native and the standardized format, we descriptively compared cohort attrition and sample size, patient characteristics, and healthcare resource utilization during baseline and incidence of selected conditions after COVID-19 diagnosis.

Results: The standardized cohort included fewer patients than the native (164 445 vs. 198 317), but age (median 48 years) and sex (70% female) were the same. The distribution of race was similar; however, the standardized cohort mapped patients with "Other" race to the "Unknown/Missing" race category, which created differences among those categories. Distributions were similar, albeit slightly lower for comorbidities (differences < 1%), and lower for SARS-CoV-2 diagnostic tests (59% vs. 70%). Medical encounter counts were also lower, with substantial differences that were attenuated after limiting encounter counts to one event per day (e.g., mean count of 6.0 vs. 27.7 specialty care visits reduced to 2.9 vs. 3.5). Incidence rates were lower, with the greatest difference for hepatotoxicity (29.6 vs. 37.1 per 1000 person-years).

Conclusions: The data standardization refines the data (e.g., removes duplicate claims and variables or variable categories), which may reduce outliers and errors but yield lower distributions and counts of certain variables than observed in native format data. Therefore, it is critical to understand how standardization impacts the data and subsequently its fitness for use.
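
The Results describe two computations that are easy to misread: limiting encounter counts to one event per day, and reporting incidence per 1000 person-years. The following is a minimal sketch of both, not the authors' code. The record layout `(patient_id, service_date)`, the function names, and the toy data are all assumptions made for illustration. Note that in this sketch the mean is taken only over patients who have at least one record.

```python
from collections import defaultdict
from datetime import date


def mean_encounters(records, one_per_day=False):
    """Mean encounter count per patient.

    records: iterable of (patient_id, service_date) tuples (hypothetical layout).
    one_per_day: if True, count at most one encounter per patient per calendar day.
    """
    per_patient = defaultdict(list)
    for patient_id, service_date in records:
        per_patient[patient_id].append(service_date)
    if not per_patient:
        return 0.0
    counts = [
        len(set(dates)) if one_per_day else len(dates)
        for dates in per_patient.values()
    ]
    return sum(counts) / len(counts)


def incidence_per_1000_py(events, person_years):
    """Incidence rate expressed per 1000 person-years of follow-up."""
    return 1000.0 * events / person_years


# Toy claims: patient A has three same-day records plus one other visit,
# patient B has two same-day records.
claims = [
    ("A", date(2021, 3, 1)),
    ("A", date(2021, 3, 1)),
    ("A", date(2021, 3, 1)),
    ("A", date(2021, 4, 2)),
    ("B", date(2021, 5, 5)),
    ("B", date(2021, 5, 5)),
]

raw_mean = mean_encounters(claims)                      # every record counted
daily_mean = mean_encounters(claims, one_per_day=True)  # same-day records collapsed
```

Collapsing to distinct patient-days removes the influence of repeated same-day claim lines, which is one plausible reason the gap between native and standardized encounter counts narrows under that restriction.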

Source journal
CiteScore: 4.80
Self-citation rate: 7.70%
Articles published: 173
Review time: 3 months
Journal description: The aim of Pharmacoepidemiology and Drug Safety is to provide an international forum for the communication and evaluation of data, methods and opinion in the discipline of pharmacoepidemiology. The Journal publishes peer-reviewed reports of original research, invited reviews and a variety of guest editorials and commentaries embracing scientific, medical, statistical, legal and economic aspects of pharmacoepidemiology and post-marketing surveillance of drug safety. Appropriate material in these categories may also be considered for publication as a Brief Report. Particular areas of interest include: design, analysis, results, and interpretation of studies looking at the benefit or safety of specific pharmaceuticals, biologics, or medical devices, including studies in pharmacovigilance, postmarketing surveillance, pharmacoeconomics, patient safety, molecular pharmacoepidemiology, or any other study within the broad field of pharmacoepidemiology; comparative effectiveness research relating to pharmaceuticals, biologics, and medical devices. Comparative effectiveness research is the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition, as these methods are truly used in the real world; methodologic contributions of relevance to pharmacoepidemiology, whether original contributions, reviews of existing methods, or tutorials for how to apply the methods of pharmacoepidemiology; assessments of harm versus benefit in drug therapy; patterns of drug utilization; relationships between pharmacoepidemiology and the formulation and interpretation of regulatory guidelines; evaluations of risk management plans and programmes relating to pharmaceuticals, biologics and medical devices.