The Secondary Use of Electronic Health Records for Data Mining: Data Characteristics and Challenges

Tabinda Sarwar, S. Seifollahi, Jeffrey A Chan, Xiuzhen Zhang, V. Aksakalli, I. Hudson, Karin M. Verspoor, L. Cavedon
ACM Computing Surveys (CSUR), pp. 1-40. Published 2022-01-18. DOI: https://doi.org/10.1145/3490234
Citations: 18

Abstract

The primary objective of implementing Electronic Health Records (EHRs) is to improve the management of patients’ health-related information. However, these records are also used extensively for secondary purposes: clinical research and the improvement of healthcare practice. EHRs provide a rich set of information that includes demographics, medical history, medications, laboratory test results, and diagnoses. Data mining and analytics techniques have extensively exploited EHR information to study patient cohorts for various clinical and research applications, such as phenotype extraction, precision medicine, intervention evaluation, and disease prediction, detection, and progression. Nevertheless, the diversity of data types and their associated characteristics poses many challenges to the use of EHR data. In this article, we provide an overview of the information found in EHR systems, and its characteristics, that could be utilized for secondary applications. We first discuss the different types of data stored in EHRs, followed by the data transformations necessary for data analysis and mining. We then discuss the data quality issues and characteristics of EHRs, along with the methods used to address them. This survey also highlights how various data types are used in different applications. This article can therefore serve as a primer for researchers seeking to understand the use of EHRs for data mining and analytics.