Using Natural Language Processing for Automated Classification of Disease and to Identify Misclassified ICD Codes in Cardiac Disease

M. Falter, D. Godderis, M. Scherrenberg, S. Kizilkilic, Linqi Xu, Marc Mertens, Jan Jansen, Pascal Legroux, H. Kindermans, Peter Sinnaeve, Frank Neven, P. Dendale
European Heart Journal - Digital Health. Published 2024-02-09. DOI: 10.1093/ehjdh/ztae008 (https://doi.org/10.1093/ehjdh/ztae008)

Abstract

ICD codes are used to classify hospitalisations and serve administrative, financial and research purposes. Coding errors are known to occur, however. Natural language processing (NLP) offers promising solutions for optimising the coding process.

This study aimed to investigate methods for automatic classification of disease in unstructured medical records using NLP and to compare them with conventional ICD coding.

Two datasets were used: the open-source MIMIC-III dataset (n = 55,177) and a dataset from a hospital in Belgium (n = 12,706). Automated searches using NLP algorithms were performed for the diagnoses “atrial fibrillation” (AF) and “heart failure” (HF). Four methods were used: rule-based search, logistic regression, term frequency-inverse document frequency (TF-IDF), XGBoost and BioBERT. All algorithms were developed on the MIMIC-III dataset. The best-performing algorithm was then deployed on the Belgian dataset.

After pre-processing, a total of 1,438 reports were retained in the Belgian dataset. XGBoost on a TF-IDF matrix achieved an accuracy of 0.94 for AF and 0.92 for HF. There were 211 mismatches between the algorithm and the ICD codes. Of these, 103 were due to differences in data availability or differing definitions. Of the remaining 108 mismatches, 70% were due to incorrect labelling by the algorithm and 30% were due to erroneous ICD coding (2% of total hospitalisations).

A newly developed NLP algorithm attained high accuracy in classifying disease in medical records. XGBoost outperformed the deep learning technique BioBERT. NLP algorithms could be used to identify ICD-coding errors and to optimise and support the ICD-coding process.
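To make the mismatch analysis concrete, here is a minimal Python sketch of the simplest of the approaches named in the abstract, a rule-based search, compared against ICD codes. It is an illustration under stated assumptions, not the authors' implementation. The keyword patterns, the ICD-10 prefixes (I48 for atrial fibrillation and flutter, I50 for heart failure), the toy records and all function names are hypothetical. The final lines redo the mismatch arithmetic, assuming that "total hospitalisations" refers to the 1,438 retained reports.

```python
import re

# Hypothetical keyword rules; the abstract does not publish the actual rule set.
PATTERNS = {
    "AF": re.compile(r"\b(atrial fibrillation|a-?fib)\b", re.IGNORECASE),
    "HF": re.compile(r"\b(heart failure|chf)\b", re.IGNORECASE),
}

# Assumed ICD-10 prefixes: I48 (atrial fibrillation and flutter), I50 (heart failure).
ICD_PREFIXES = {"AF": ("I48",), "HF": ("I50",)}


def rule_based_labels(text):
    """Return the diagnoses whose keyword pattern occurs in the report text.

    Negation ("no atrial fibrillation") is deliberately not handled here.
    """
    return {dx for dx, pattern in PATTERNS.items() if pattern.search(text)}


def icd_labels(codes):
    """Return the diagnoses implied by a list of ICD codes."""
    return {
        dx
        for dx, prefixes in ICD_PREFIXES.items()
        if any(code.startswith(prefixes) for code in codes)
    }


def find_mismatches(records):
    """List (record id, diagnosis, found_in_text, found_in_icd) for every disagreement."""
    mismatches = []
    for record in records:
        from_text = rule_based_labels(record["text"])
        from_icd = icd_labels(record["icd_codes"])
        # Symmetric difference: diagnoses flagged by exactly one of the two sources.
        for dx in sorted(from_text ^ from_icd):
            mismatches.append((record["id"], dx, dx in from_text, dx in from_icd))
    return mismatches


# Toy records, invented for illustration only.
records = [
    {"id": 1, "text": "Known atrial fibrillation, rate controlled.", "icd_codes": ["I48.0"]},
    {"id": 2, "text": "Admitted with decompensated heart failure.", "icd_codes": ["I10"]},
    {"id": 3, "text": "Elective knee replacement.", "icd_codes": ["I50.9"]},
]
mismatches = find_mismatches(records)
for row in mismatches:
    print(row)

# Arithmetic from the abstract: 211 mismatches, 103 explained by data or definitions.
remaining = 211 - 103
# 30% of the remaining mismatches were ICD-coding errors; denominator assumed to be 1,438.
icd_error_share_pct = 30 * remaining / 1438
print(remaining, round(icd_error_share_pct, 1))
```

In a real review, each mismatch would still need manual adjudication, since the abstract reports that most of the unexplained mismatches were algorithm errors rather than coding errors.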