Natural Language Mapping of Electrocardiogram Interpretations to a Standardized Ontology.

IF 1.8 | Zone 4, Medicine | Q3 | Computer Science, Information Systems

Methods of Information in Medicine Pub Date : 2021-09-01 Epub Date: 2021-10-05 DOI:10.1055/s-0041-1736312

Richard H Epstein, Yuel-Kai Jean, Roman Dudaryk, Robert E Freundlich, Jeremy P Walco, Dorothee A Mueller, Shawn E Banks

Methods of Information in Medicine, volume 60 (3-04), pages 104-109. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8595771/pdf/nihms-1752621.pdf

Citations: 0

Abstract

Background: Interpretations of the electrocardiogram (ECG) are often prepared using software outside the electronic health record (EHR) and imported via an interface as a narrative note. Thus, natural language processing is required to create a computable representation of the findings. Challenges include misspellings, nonstandard abbreviations, jargon, and equivocation in diagnostic interpretations.

Objectives: Our objective was to develop an algorithm that reliably and efficiently extracts such information and maps it to the standardized ECG ontology developed jointly by the American Heart Association, the American College of Cardiology Foundation, and the Heart Rhythm Society. The algorithm was also to be easily modifiable for use with EHRs and ECG reporting systems other than the ones studied.

Methods: An algorithm using natural language processing techniques was developed in Structured Query Language (SQL) to extract quantitative and diagnostic information from ECG narrative reports and map it to the cardiology societies' standardized ECG ontology. The algorithm was developed using a training dataset of 43,861 ECG reports and applied to a test dataset of 46,873 reports.
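The two steps named in the Methods (extracting quantitative values and mapping diagnostic phrases through a lookup table) can be illustrated with a minimal sketch. This is not the authors' SQL implementation: it is written in Python for brevity, the report layout is assumed, and the function names, regular expressions, and category labels are hypothetical placeholders rather than the actual codes of the AHA/ACCF/HRS ontology.

```python
import re

# Hypothetical lookup table: phrase variants (a misspelling and abbreviations
# included) mapped to placeholder category labels. These labels are NOT the
# real AHA/ACCF/HRS statement codes.
PHRASE_TO_CATEGORY = {
    "atrial fibrillation": "ATRIAL_FIBRILLATION",
    "atrial fibrilation": "ATRIAL_FIBRILLATION",  # misspelling
    "afib": "ATRIAL_FIBRILLATION",                # nonstandard abbreviation
    "sinus bradycardia": "SINUS_BRADYCARDIA",
    "left bundle branch block": "LEFT_BUNDLE_BRANCH_BLOCK",
    "lbbb": "LEFT_BUNDLE_BRANCH_BLOCK",
}

# Assumed report layout, e.g. "PR interval 162 ms", "QTc: 431 ms",
# "Vent. rate 58 BPM". Real reporting systems will differ.
QUANTITY_PATTERNS = {
    "pr_ms": re.compile(
        r"\bPR(?:\s+interval)?\s*[:=]?\s*(\d{2,3})\s*ms\b", re.I),
    "qtc_ms": re.compile(
        r"\bQTc(?:\s+interval)?\s*[:=]?\s*(\d{2,3})\s*ms\b", re.I),
    "vent_rate_bpm": re.compile(
        r"\bvent(?:ricular|\.)?\s+rate\s*[:=]?\s*(\d{2,3})\s*bpm\b", re.I),
}


def extract_quantities(report: str) -> dict:
    """Return each quantitative value as an int, or None if not found."""
    values = {}
    for name, pattern in QUANTITY_PATTERNS.items():
        match = pattern.search(report)
        values[name] = int(match.group(1)) if match else None
    return values


def map_diagnoses(report: str) -> set:
    """Return the set of placeholder categories whose phrases appear."""
    # Lowercase and collapse whitespace so phrases split across lines match.
    text = re.sub(r"\s+", " ", report.lower())
    found = set()
    for phrase, category in PHRASE_TO_CATEGORY.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(category)
    return found
```

Keeping the phrase variants in a table rather than in code mirrors the abstract's point about local customization: adapting to another reporting system would mostly mean editing table rows. The sketch does not handle equivocation (for example "possible" or "cannot rule out"), which the abstract lists among the challenges.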

Results: Accuracy, precision, recall, and the F1-measure were all 100% in the test dataset for the extraction of quantitative data (e.g., PR and QTc intervals, atrial and ventricular heart rates). Performance for matches in each diagnostic category in the standardized ECG ontology was above 99% in the test dataset. The processing speed was approximately 20,000 reports per minute. We externally validated the algorithm at another institution that used a different ECG reporting system and found similar performance.
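For readers unfamiliar with the four measures reported above, the following sketch shows how they are conventionally computed from a confusion matrix. The function name is hypothetical and any counts passed to it are illustrative; the abstract reports only the resulting percentages, not the underlying counts.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives, and
    true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

All four measures equal 1.0 only when there are no false positives and no false negatives, which is what the 100% figures for quantitative extraction imply. As a separate piece of arithmetic, at roughly 20,000 reports per minute the 46,873-report test dataset would take about 2.3 minutes to process.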

Conclusion: The algorithm performed well in creating a computable representation of ECG interpretations. Software and lookup tables are provided that can easily be modified for local customization and for use with other EHR and ECG reporting systems. This algorithm has utility for research and for clinical decision support where incorporation of ECG findings is desired.




Source journal

Methods of Information in Medicine (Medicine; Computer Science, Information Systems)

CiteScore: 3.70

Self-citation rate: 11.80%

Review time: 6-12 weeks

Journal description: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing, and analyzing data, information, and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal issue.