Artificial Intelligence to Improve Clinical Coding Practice in Scandinavia: Crossover Randomized Controlled Trial.

Impact factor 5.8; Zone 2 (Medicine); JCR Q1, HEALTH CARE SCIENCES & SERVICES

Journal of Medical Internet Research. Pub Date: 2025-07-03. DOI: 10.2196/71904

Taridzo Chomutare, Therese Olsen Svenning, Miguel Ángel Tejedor Hernández, Phuong Dinh Ngo, Andrius Budrionis, Kaisa Markljung, Lill Irene Hind, Torbjørn Torsvik, Karl Øyvind Mikalsen, Aleksandar Babic, Hercules Dalianis

Journal of Medical Internet Research, volume 27, e71904. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12244276/pdf/


Abstract

Background: Clinical coding is critical for hospital reimbursement, quality assessment, and health care planning. In Scandinavia, however, coding is often done by junior doctors or medical secretaries, leading to high rates of coding errors. Artificial intelligence (AI) tools, particularly semiautomatic computer-assisted coding tools, have the potential to reduce the excessive burden of administrative and clinical documentation. To date, much of what we know regarding these tools comes from lab-based evaluations, which often fail to account for real-world complexity and variability in clinical text.

Objective: This study aims to investigate whether Easy-ICD (International Classification of Diseases), an AI tool developed by the Norwegian Centre for E-health Research at the University Hospital of North Norway, can enhance clinical coding practices by reducing coding time and improving data quality in a realistic setting. We specifically examined whether improvements differ between long and short clinical notes, defined by word count.

Methods: An AI tool, Easy-ICD, was developed to assist clinical coders and was tested for improving both accuracy and time in a 1:1 crossover randomized controlled trial conducted in Sweden and Norway. Participants were randomly assigned to 2 groups (Sequence AB or BA) and crossed over between coding longer texts (Period 1; mean 307 words, SD 90) and shorter texts (Period 2; mean 166 words, SD 55), while using our tool versus not using our tool. This was a purely web-based trial, where participants were recruited through email. Coding time and accuracy were logged and analyzed using Mann-Whitney U tests for each of the 2 periods independently, due to differing text lengths in each period.
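As an illustration of the per-period comparison described in the Methods, the following is a minimal pure-Python sketch of a Mann-Whitney U test. It is not the authors' analysis code: the function name `mann_whitney_u`, the normal approximation with tie and continuity corrections, and the coding times in the usage lines are all assumptions made for this example, and the numbers are invented rather than trial data.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) for two independent samples.

    The p-value uses the normal approximation with a tie correction
    and a continuity correction (an assumption of this sketch).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign 1-based ranks, averaging ranks within runs of tied values.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    # U for the first sample is its rank sum minus the minimum possible rank sum.
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, u_y, 1.0
    z = max((abs(u_x - mean_u) - 0.5) / sqrt(var_u), 0.0)
    p_two_sided = erfc(z / sqrt(2))
    return u_x, u_y, p_two_sided


# Hypothetical coding times in seconds for one period (not trial data).
with_tool = [140, 150, 160, 170, 180]
without_tool = [250, 260, 270, 280, 290]
u_with, u_without, p = mann_whitney_u(with_tool, without_tool)
print(u_with, u_without, round(p, 4))
```

Running the test separately for each period, as the Methods describe, keeps the longer and shorter notes from being pooled into one comparison.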

Results: The trial had 17 participants enrolled, but only data from 15 participants (300 coded notes) were analyzed, excluding 2 incomplete records. Based on the Mann-Whitney U test, the median coding time difference for longer clinical text sequences was 123 seconds (P<.001, 95% CI 81-164), representing a 46% reduction in median coding time when our tool was used. For shorter clinical notes, the median time difference of 11 seconds was not significant (P=.25, 95% CI -34 to 8). Coding accuracy improved with Easy-ICD for both longer (62% vs 67%) and shorter clinical notes (60% vs 70%), but these differences were not statistically significant (P=.50 and P=.17, respectively). User satisfaction ratings (submitted for 37% of cases) showed slightly higher approval for the tool's suggestions on longer clinical notes.
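The reported figures for the longer notes can be related by simple arithmetic. The sketch below assumes that the 123-second figure is a plain difference between the two group medians and that the 46% reduction is relative to the median without the tool; the abstract does not state either point, so the implied medians are rough back-of-envelope values, not reported results. The even split of notes across participants is likewise an assumption.

```python
# Figures reported in the abstract.
median_diff_s = 123         # median time difference, longer notes (seconds)
relative_reduction = 0.46   # reported reduction in median coding time
analyzed_participants = 15
coded_notes = 300

# Implied medians under the assumptions stated above (not reported values).
implied_without_tool_s = median_diff_s / relative_reduction
implied_with_tool_s = implied_without_tool_s - median_diff_s

# Average notes per analyzed participant, assuming an even split.
notes_per_participant = coded_notes / analyzed_participants

print(round(implied_without_tool_s), round(implied_with_tool_s), notes_per_participant)
```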

Conclusions: This study demonstrates the potential of AI to transform common tasks in clinical workflows, with ostensible positive impacts on work efficiencies for clinical coding tasks with more demanding longer text sequences. Further studies within hospital workflows are required before these presumed impacts can be more clearly understood.


Source journal

Journal of Medical Internet Research (Medicine: Health Care)

CiteScore: 14.40

Self-citation rate: 5.40%

Articles published: 654

Review time: 1 month

About the journal: The Journal of Medical Internet Research (JMIR) is a highly respected publication in the field of health informatics and health services. Founded in 1999, JMIR has been a pioneer in the field for over two decades. The journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It is recognized as a top publication in these disciplines, ranking in the first quartile (Q1) by Impact Factor. Notably, JMIR is ranked #1 on Google Scholar within the "Medical Informatics" discipline.