A comparative analysis of machine learning models and human expertise for nursing intervention classification.

JAMIA Open · IF 3.4 · Q2, Health Care Sciences & Services
Publication date: 2025-06-27 (eCollection: 2025-06-01) · DOI: 10.1093/jamiaopen/ooaf057
Jerome Niyirora, Lynne Longtin, Cynthia Grabski, David Patrishkoff, Andriana Semko

Abstract

Objective: This study compares the performance of machine learning (ML) models and human experts in mapping unstructured nursing notes to the standardized Nursing Interventions Classification (NIC) system. The aim is to advance automated nursing documentation classification, facilitating cross-facility benchmarking of patient care and organizational outcomes.

Materials and methods: We developed and compared 4 ML models: TF-IDF text-based vectorization, UMLS semantic mapping, fine-tuned GPT-4o mini, and Bio-Clinical BERT. These models were evaluated against classifications provided by 2 expert nurses using a dataset of de-identified home healthcare nursing notes obtained from a Florida, USA-based medical clearinghouse. Model performance was assessed using agreement statistics, precision, recall, F1 scores, and Cohen's Kappa.
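As a rough illustration of the simplest of the four approaches, the TF-IDF model can be sketched as a vectorize-then-classify pipeline. The notes, labels, and classifier choice below are invented for illustration and are not the study's actual data or implementation:

```python
# Hypothetical sketch of a TF-IDF baseline for mapping free-text nursing
# notes to NIC-style categories. All notes and labels here are fabricated.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

notes = [
    "administered prescribed medication and monitored response",
    "educated patient on wound care and dressing changes",
    "reviewed discharge instructions with the caregiver",
    "adjusted insulin dose per sliding scale order",
]
labels = ["drug management", "wound care", "information management", "drug management"]

# Unigrams + bigrams feed a linear classifier; real work would tune both.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(notes, labels)

pred = clf.predict(["gave morning medications as ordered"])
print(pred[0])
```

A UMLS or BERT pipeline would replace the vectorizer with concept mapping or contextual embeddings, but the evaluation harness around it stays the same.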

Results: Human raters achieved the highest agreement with consensus labels, scoring 0.75 and 0.62, with corresponding F1 scores of 0.61 and 0.45, respectively. In comparison, ML models showed lower performance, with the fine-tuned GPT-4o mini performing best among them (agreement: 0.50, F1 score: 0.31). A distribution analysis of NIC categories revealed that ML models performed well in prevalent and clearly defined categories, such as drug management, but struggled with minority classes and context-dependent interventions, like information management.
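The agreement and F1 figures above can be reproduced mechanically for any rater or model against consensus labels. The snippet below shows the metric calls with fabricated toy labels (not the study's data):

```python
# Minimal sketch of the evaluation metrics named in the abstract:
# raw agreement, Cohen's kappa, and macro F1 against consensus labels.
# The two label arrays are invented for illustration only.
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

consensus = ["drug", "drug", "info", "wound", "info", "drug"]
rater     = ["drug", "drug", "wound", "wound", "info", "drug"]

print("agreement:", round(accuracy_score(consensus, rater), 3))
print("kappa:    ", round(cohen_kappa_score(consensus, rater), 3))
print("macro F1: ", round(f1_score(consensus, rater, average="macro"), 3))
```

Macro-averaged F1 weights every NIC category equally, which is why minority classes such as information management can drag a model's F1 well below its raw agreement.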

Discussion: Current ML approaches show promise in supporting clinical classification tasks, but the performance gap in handling complex, context-dependent interventions highlights the need for improved methods that can better capture the nuanced nature of clinical documentation. Future research should focus on developing methods to process clinical terminology and context-specific documentation with greater precision and adaptability.

Conclusion: Current ML models can aid, but not fully replace, human judgment in classifying nuanced nursing interventions.
