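Assessing the harmonization of structured electronic health record data to reference terminologies and data completeness through data provenance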

Impact Factor: 2.6 | JCR Q2 | Health Policy & Services

Learning Health Systems, Volume 9, Issue 2 | Pub Date: 2024-10-21 | DOI: 10.1002/lrh2.10468

Keith Marsolo, Lesley Curtis, Laura Qualls, Jennifer Xu, Yinghong Zhang, Thomas Phillips, C. Larry Hill, Gretchen Sanders, Judith C. Maro, Daniel Kiernan, Christine Draper, Kevin Coughlin, Sarah K. Dutcher, José J. Hernández-Muñoz, Monique Falconer


Citations: 0

Abstract


Assessing the harmonization of structured electronic health record data to reference terminologies and data completeness through data provenance

Introduction

(1) Assess the harmonization of structured electronic health record data (laboratory results and medications) to reference terminologies and characterize the severity of issues. (2) Identify issues of data completeness by comparing complementary data domains, stratifying by time, care setting, and provenance.

Methods

Queries were distributed to 3 Data Partners (DPs). Using harmonization queries, we examined the top 200 laboratory results and medications by volume, identifying outliers and computing summary statistics. The completeness queries looked at 4 conditions of interest and related clinical concepts. Counts were generated for each condition, stratified by year, encounter type, and provenance. We analyzed trends over time within and across DPs.
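The harmonization check described above can be illustrated with a minimal Python sketch. It is not the authors' query code: the record layout, the placeholder codes, the function names, and the outlier rule (more than one distinct code per name) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (lab_name, code) pairs. The codes are placeholders,
# and None stands in for a record with a "null" code.
records = [
    ("Hemoglobin A1c", "CODE-A1"),
    ("Hemoglobin A1c", "CODE-A1"),
    ("Hemoglobin A1c", "CODE-A2"),
    ("Glucose", "CODE-G1"),
    ("Glucose", None),
    ("Creatinine", "CODE-C1"),
]

def codes_per_name(rows):
    """Map each name to the set of distinct non-null codes recorded with it."""
    mapping = defaultdict(set)
    for name, code in rows:
        mapping[name]  # touch the key so names with only null codes still appear
        if code is not None:
            mapping[name].add(code)
    return dict(mapping)

def summarize(mapping, threshold=1):
    """Return the median code count and the names exceeding the assumed threshold."""
    counts = {name: len(codes) for name, codes in mapping.items()}
    outliers = sorted(name for name, n in counts.items() if n > threshold)
    return median(counts.values()), outliers

mapping = codes_per_name(records)
med, outliers = summarize(mapping)
null_count = sum(1 for _, code in records if code is None)

print(med, outliers, null_count)
```

Swapping the two tuple positions gives the reverse view (names per code). A real query would also need to restrict to the top 200 items by volume before summarizing.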

Results

We found that the median number of codes associated with a given laboratory/medication name (and vice versa) generally met expectations, though there were DP-specific issues that resulted in outliers. In addition, there were drastic differences in the percentage of patients with a given concept depending on provenance.
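To make the provenance comparison concrete, the sketch below computes the percentage of a condition cohort with a given concept, split by provenance and for any provenance combined. The patient IDs, the provenance labels ("order", "dispensing"), and the function name are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical cohort: patient IDs with the condition of interest.
cohort = {"p1", "p2", "p3", "p4"}

# Hypothetical records for one related concept: (patient_id, provenance).
records = [
    ("p1", "order"),
    ("p1", "dispensing"),
    ("p2", "order"),
    ("p3", "dispensing"),
]

def percent_with_concept(cohort, rows):
    """Percentage of cohort patients with the concept, per provenance and overall."""
    patients_by_prov = defaultdict(set)
    for pid, prov in rows:
        if pid in cohort:
            patients_by_prov[prov].add(pid)
            patients_by_prov["any"].add(pid)  # union across provenance types
    return {
        prov: 100.0 * len(pids) / len(cohort)
        for prov, pids in patients_by_prov.items()
    }

result = percent_with_concept(cohort, records)
print(result)
```

In this toy data each single provenance covers half the cohort while the union covers three quarters, which is the kind of gap the comparison is meant to expose. Stratifying by year or encounter type would amount to adding those fields to the grouping key.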

Conclusions

The harmonization queries surfaced several mapping errors, as well as issues with overly specific codes and records with “null” codes. The completeness queries demonstrated that having access to multiple types of data provenance provides more robust results than any single provenance type. Harmonization errors between source data and reference terminologies may not be widespread but do exist within common data models (CDMs), affecting tens of thousands or even millions of records. Provenance information can help identify potential completeness issues with EHR data, but only if it is represented in the CDM and then populated by DPs.


Source journal

Learning Health Systems (Health Policy & Services)

CiteScore

5.60

Self-citation rate

22.60%

Articles published

Review time

20 weeks