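Automating Electronic Clinical Data Capture for Quality Improvement and Research: The CERTAIN Validation Project of Real World Evidence.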

Emily Beth Devine, Erik Van Eaton, Megan E Zadworny, Rebecca Symons, Allison Devlin, David Yanez, Meliha Yetisgen, Katelyn R Keyloun, Daniel Capurro, Rafael Alfonso-Cristancho, David R Flum, Peter Tarczy-Hornoch
Journal: EGEMS (Washington, DC), p. 8. Published: 2018-05-22. Publication type: Journal Article. DOI: https://doi.org/10.5334/egems.211. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983060/pdf/
Citations: 10

Abstract


Background: The availability of high fidelity electronic health record (EHR) data is a hallmark of the learning health care system. Washington State's Surgical Care Outcomes and Assessment Program (SCOAP) is a network of hospitals participating in quality improvement (QI) registries wherein data are manually abstracted from EHRs. To create the Comparative Effectiveness Research and Translation Network (CERTAIN), we semi-automated SCOAP data abstraction using a centralized federated data model, created a central data repository (CDR), and assessed whether these data could be used as real world evidence for QI and research.

Objectives: Describe the validation processes and complexities involved and lessons learned.

Methods: Investigators installed a commercial CDR to retrieve and store data from disparate EHRs. Manual and automated abstraction systems were conducted in parallel (10/2012-7/2013) and validated in three phases using the EHR as the gold standard: 1) ingestion, 2) standardization, and 3) concordance of automated versus manually abstracted cases. Information retrieval statistics were calculated.
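The phase 3 comparison described above can be illustrated with a minimal sketch. The abstract does not say which information retrieval statistics were calculated, so precision, recall, and F-measure are assumed here, with concordance taken as simple percent agreement. The function name `retrieval_stats` and the toy flags are hypothetical and are not study data.

```python
def retrieval_stats(automated, gold):
    """Compare per-case binary flags from automated abstraction
    against gold-standard flags (here, the EHR review).

    Assumption: the choice of statistics (precision, recall, F1)
    is illustrative; the source does not name them.
    """
    if len(automated) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(automated, gold))
    tp = sum(1 for a, g in pairs if a and g)          # both positive
    fp = sum(1 for a, g in pairs if a and not g)      # automated only
    fn = sum(1 for a, g in pairs if not a and g)      # gold only
    tn = sum(1 for a, g in pairs if not a and not g)  # both negative

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    concordance = (tp + tn) / len(pairs)  # simple percent agreement

    return {"precision": precision, "recall": recall,
            "f1": f1, "concordance": concordance}


# Toy example: eight cases, one data element (hypothetical values).
automated = [True, True, False, True, False, False, True, False]
gold = [True, False, False, True, False, True, True, False]
stats = retrieval_stats(automated, gold)
print(stats)
```

In this toy example, three true positives, one false positive, one false negative, and three true negatives give 0.75 for all four measures.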

Results: Four unaffiliated health systems provided data. Between 6 and 15 percent of data elements were abstracted: 51 to 86 percent from structured data; the remainder using natural language processing (NLP). In phase 1, data ingestion from 12 out of 20 feeds reached 95 percent accuracy. In phase 2, 55 percent of structured data elements performed with 96 to 100 percent accuracy; NLP with 89 to 91 percent accuracy. In phase 3, concordance ranged from 69 to 89 percent. Information retrieval statistics were consistently above 90 percent.

Conclusions: Semi-automated data abstraction may be useful, although raw data collected as a byproduct of health care delivery is not immediately available for use as real world evidence. New approaches to gathering and analyzing extant data are required.
