Using a Large Open Clinical Corpus for Improved ICD-10 Diagnosis Coding.

AMIA ... Annual Symposium proceedings. AMIA Symposium. Pub Date: 2024-01-11. eCollection Date: 2023-01-01.

Anastasios Lamproudis, Therese Olsen Svenning, Torbjørn Torsvik, Taridzo Chomutare, Andrius Budrionis, Phuong Dinh Ngo, Thomas Vakili, Hercules Dalianis

Abstract

With the recent advances in natural language processing and deep learning, the development of tools that can assist medical coders in ICD-10 diagnosis coding and increase their efficiency in coding discharge summaries is significantly more viable than before. To that end, one important component in the development of these models is the datasets used to train them. In this study, such datasets are presented, and it is shown that one of them can be used to develop a BERT-based language model that can consistently perform well in assigning ICD-10 codes to discharge summaries written in Swedish. Most importantly, it can be used in a coding support setup where a tool can recommend potential codes to the coders. This reduces the range of potential codes to consider and, in turn, reduces the workload of the coder. Moreover, the de-identified and pseudonymised dataset is open to use for academic users.
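The coding support setup described in the abstract can be pictured as a ranking step over per-code model scores. The following is a minimal sketch only, not the authors' implementation: it assumes a multi-label classifier that emits one logit per ICD-10 code, and the function name `recommend_codes`, the shortlist size `k`, and the example logits are all hypothetical values chosen for illustration.

```python
import math


def recommend_codes(logits, k=5):
    """Return the k highest-scoring codes as (code, probability) pairs.

    Assumption: multi-label setup, so each code gets an independent
    sigmoid probability rather than a softmax over all codes.
    """
    probs = {code: 1.0 / (1.0 + math.exp(-z)) for code, z in logits.items()}
    # Sort codes from most to least probable and keep the top k.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical logits for one discharge summary (illustrative values only,
# not results from the paper).
example_logits = {"K35.8": 2.1, "K57.3": -0.4, "K80.2": 0.7, "K21.9": -1.8}
shortlist = recommend_codes(example_logits, k=2)
```

A coder would then review only the shortlist instead of the full code set, which is the workload reduction the abstract refers to.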
