Semantics-driven improvements in electronic health records data quality: a systematic review
Authors: Yirong Wu, Mudan Ren, Na Chen, Liu Yang
Journal: BMC Medical Informatics and Decision Making, 25(1), 298 (JCR Q2, Medical Informatics; impact factor 3.8)
Published: 2025-08-11
DOI: https://doi.org/10.1186/s12911-025-03146-w
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12337493/pdf/
Citations: 0
Abstract
Background: The data quality (DQ) of electronic health records (EHRs) is crucial for the advancement of health informatization, yet it remains a significant challenge. Scholars are showing growing interest in leveraging semantic technologies to enhance EHR data quality. However, previous studies have focused predominantly on specific semantic technologies, scenarios, or objectives (such as interoperability), often overlooking the potential of various semantic technologies across different scenarios.
Objective: This systematic review aimed to explore the potential of employing a range of semantic technologies to improve EHR data quality in a broader spectrum of application scenarios.
Methods: Our systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Three databases were searched, including PubMed, IEEE Xplore, and Web of Science Core Collection. The search terms used included "Semantic*", "Quality", "Electronic Health Record*", "EHR*", "Electronic Medical Record*", and "EMR*". These terms were combined via various Boolean operators to formulate multiple search queries.
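The way the listed search terms might be combined with Boolean operators can be sketched as follows. This is a minimal illustration only: the abstract does not give the actual query strings, so the grouping (synonyms joined with OR, concepts joined with AND) and the helper names `or_group` and `build_query` are assumptions, and each database has its own rules for wildcards and quoted phrases.

```python
# Hypothetical sketch: the real queries used in the review are not stated
# in the abstract. Grouping and helper names are illustrative assumptions.

def or_group(terms):
    """Quote each term and join synonyms with OR (parenthesized if more than one)."""
    quoted = [f'"{t}"' for t in terms]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concept_groups):
    """Join concept groups with AND so every concept must be matched."""
    return " AND ".join(or_group(group) for group in concept_groups)


# Search terms as listed in the Methods paragraph.
SEMANTIC = ["Semantic*"]
QUALITY = ["Quality"]
RECORDS = [
    "Electronic Health Record*",
    "EHR*",
    "Electronic Medical Record*",
    "EMR*",
]

query = build_query([SEMANTIC, QUALITY, RECORDS])
print(query)
```

Keeping each concept as its own list makes it straightforward to generate several query variants (for example, dropping or swapping one group) and adapt the string to the syntax of a particular database.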
Results: Thirty-seven papers published between 2008 and 2024 that met the inclusion criteria were analyzed. Six semantic technologies were identified as instrumental in improving EHR DQ: EHR standardization, controlled vocabulary, ontology, semantic web, knowledge graph, and natural language processing (NLP). These technologies were further mapped to 16 core data quality indicators and the FAIR principles (Findable, Accessible, Interoperable, and Reusable), highlighting their contributions across both technical and governance dimensions.
Conclusions: The six identified semantic technologies can be categorized into three levels: foundational, general, and advanced. These technologies show significant potential in enhancing EHR DQ, particularly in the areas of conformance, portability, usability, and applicability. They are suitable for a variety of contexts beyond interoperability and are consistent with FAIR-aligned best practices in data management and reuse.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.