Designing NLP applications to support ICD coding: an impact analysis and guidelines to enhance baseline performance when processing patient discharge notes

Jessica Jha, Mario Almagro, Hegler Tissot
Journal of Digital Health, volume 25 6, published 2023-10-30 (journal article). DOI: 10.55976/jdh.22023119463-81, https://doi.org/10.55976/jdh.22023119463-81

Abstract

Financial costs are a major concern in the healthcare system, with medical billing and coding playing a key role in facilitating transactions and financing procedures. Billing involves filing claims with insurance companies and requires scrutiny of clinical summaries and electronic health records to correctly match diagnoses, prescriptions, and procedures to standardized codes. Accuracy in assigning International Classification of Diseases (ICD) codes is critical to proper reimbursement of care. Incorrect codes waste time and resources and cause administrative and financial problems for hospitals, insurance companies, and patients. Manual medical coding is labor-intensive and error-prone, which adds to that administrative burden. To simplify the process, clinical records are often processed to automatically identify and extract clinical concepts and the corresponding ICD codes. Deep learning and natural language processing techniques have shown promise in a variety of tasks, but applying them to medical coding has been challenging. Accurate coding requires a deep understanding of medical terminology, context, and guidelines that may be difficult to capture with traditional deep learning methods. Although deep learning shows promise in healthcare, its specific impact on ICD coding is not fully understood, and translating scalable deep learning methods into practical improvements in ICD coding remains a challenge. Evaluating deep learning models under real-world coding scenarios and comparing them to established practice is critical to determining their true effectiveness.

In this work, we address the automation of ICD coding by highlighting pitfalls and contrasting different perspectives. We investigated automatic ICD coding using baseline machine learning models, with a focus on identifying ICD-9 codes in discharge notes from the Medical Information Mart for Intensive Care (MIMIC) database. A thorough evaluation of different models and approaches is crucial to avoid over-reliance on any single method. Our findings show that simpler methods can achieve results comparable to those of deep learning models while requiring fewer computational resources.
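Comparing a simple baseline with a deep learning model on this task means scoring sets of predicted codes per discharge note. The abstract does not state which metrics the authors used; the sketch below assumes micro-averaged F1, a common choice for multi-label code assignment. The function name `micro_f1` and the toy gold and predicted code sets are illustrative only and are not taken from the paper.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], predicted: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 over per-note sets of ICD codes (assumed metric)."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)  # codes correctly assigned
        fp += len(p - g)  # codes assigned but absent from the gold set
        fn += len(g - p)  # gold codes that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up data: two notes with hypothetical gold and predicted ICD-9 codes.
gold = [{"428.0", "401.9"}, {"250.00"}]
pred = [{"428.0"}, {"250.00", "401.9"}]
score = micro_f1(gold, pred)
```

Scoring every model with the same function on the same held-out notes keeps the comparison between simple and deep models like for like. Micro-averaging weights frequent codes more heavily, so a macro-averaged variant may be worth reporting alongside it when rare codes matter.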