Diagnostic accuracy differences in detecting wound maceration between humans and artificial intelligence: the role of human expertise revisited.

IF 4.6 · JCR Q1 (Computer Science, Information Systems) · CAS Zone 2 (Medicine)
Florian Kücking, Ursula H Hübner, Dorothee Busch
Citations: 0

Abstract


Objective: This study aims to compare the diagnostic abilities of humans in wound image assessment with those of an AI-based model, examine how "expertise" affects clinicians' diagnostic performance, and investigate the heterogeneity in clinical judgments.

Materials and methods: A total of 481 healthcare professionals completed a diagnostic task involving 30 chronic wound images with and without maceration. A convolutional neural network (CNN) classification model performed the same task. To predict human accuracy, participants' "expertise," i.e., pertinent formal qualification, work experience, self-confidence, and wound focus, was analyzed in a regression analysis. Human interrater reliability was calculated.

Results: Human participants achieved an average accuracy of 79.3% and a maximum accuracy of 85% in the formally qualified group. Achieving 90% accuracy, the CNN performed better but not significantly. Pertinent formal qualification (β = 0.083, P < .001) and diagnostic self-confidence (β = 0.015, P = .002) significantly predicted human accuracy, while work experience and focus on wound care had no effect (R² = 24.3%). Overall interrater reliability was "fair" (kappa = 0.391).
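The "fair" label for kappa = 0.391 follows the common Landis–Koch benchmark (0.21–0.40 "fair", 0.41–0.60 "moderate"). The abstract does not specify which kappa variant was used; with 481 raters a multi-rater statistic such as Fleiss' kappa is typical. As a minimal sketch of the underlying idea, the two-rater Cohen's kappa below is computed on invented binary maceration ratings (the paper's rating data are not published):

```python
from collections import Counter

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters assigning categorical labels.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is chance agreement from each rater's label frequencies.
    """
    assert len(r1) == len(r2)
    n = len(r1)
    # observed agreement: fraction of items both raters labeled identically
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    labels = set(r1) | set(r2)
    # chance agreement from the raters' marginal label distributions
    expected = sum(c1[lab] * c2[lab] for lab in labels) / n**2
    return (observed - expected) / (1 - expected)

# hypothetical ratings (1 = maceration present) on 10 wound images
rater_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater_b = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
kappa = cohen_kappa(rater_a, rater_b)  # 0.4, "fair" by Landis-Koch
```

Kappa corrects raw percent agreement for agreement expected by chance, which is why 70% raw agreement here yields only kappa = 0.4; a study-scale analysis would extend this to all raters (e.g., via Fleiss' kappa).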

Discussion: Among the "expertise"-related factors, only the qualification and self-confidence variables influenced diagnostic accuracy. These findings challenge previous assumptions about work experience or job titles defining "expertise" and influencing human diagnostic performance.

Conclusion: This study offers guidance to future studies when comparing human expert and AI task performance. However, to explain human diagnostic accuracy, "expertise" may only serve as one correlate, while additional factors need further research.

Source journal: Journal of the American Medical Informatics Association (Medicine – Computer Science: Interdisciplinary Applications)
CiteScore: 14.50
Self-citation rate: 7.80%
Annual articles: 230
Review turnaround: 3-8 weeks
Journal description: JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.