{"title":"人类和人工智能在伤口浸渍检测中的诊断准确性差异:重新审视人类专业知识的作用。","authors":"Florian Kücking, Ursula H Hübner, Dorothee Busch","doi":"10.1093/jamia/ocaf116","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>This study aims to compare the diagnostic abilities of humans in wound image assessment with those of an AI-based model, examine how \"expertise\" affects clinicians' diagnostic performance, and investigate the heterogeneity in clinical judgments.</p><p><strong>Materials and methods: </strong>A total of 481 healthcare professionals completed a diagnostic task involving 30 chronic wound images with and without maceration. A convolutional neural network (CNN) classification model performed the same task. To predict human accuracy, participants' \"expertise,\" ie, pertinent formal qualification, work experience, self-confidence, and wound focus, was analyzed in a regression analysis. Human interrater reliability was calculated.</p><p><strong>Results: </strong>Human participants achieved an average accuracy of 79.3% and a maximum accuracy of 85% in the formally qualified group. Achieving 90% accuracy, the CNN performed better but not significantly. Pertinent formal qualification (β = 0.083, P < .001) and diagnostic self-confidence (β = 0.015, P = .002) significantly predicted human accuracy, while work experience and focus on wound care had no effect (R2 = 24.3%). Overall interrater reliability was \"fair\" (Kappa = 0.391).</p><p><strong>Discussion: </strong>Among the \"expertise\"-related factors, only the qualification and self-confidence variables influenced diagnostic accuracy. These findings challenge previous assumptions about work experience or job titles defining \"expertise\" and influencing human diagnostic performance.</p><p><strong>Conclusion: </strong>This study offers guidance to future studies when comparing human expert and AI task performance. 
However, to explain human diagnostic accuracy, \"expertise\" may only serve as one correlate, while additional factors need further research.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.6000,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Diagnostic accuracy differences in detecting wound maceration between humans and artificial intelligence: the role of human expertise revisited.\",\"authors\":\"Florian Kücking, Ursula H Hübner, Dorothee Busch\",\"doi\":\"10.1093/jamia/ocaf116\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objective: </strong>This study aims to compare the diagnostic abilities of humans in wound image assessment with those of an AI-based model, examine how \\\"expertise\\\" affects clinicians' diagnostic performance, and investigate the heterogeneity in clinical judgments.</p><p><strong>Materials and methods: </strong>A total of 481 healthcare professionals completed a diagnostic task involving 30 chronic wound images with and without maceration. A convolutional neural network (CNN) classification model performed the same task. To predict human accuracy, participants' \\\"expertise,\\\" ie, pertinent formal qualification, work experience, self-confidence, and wound focus, was analyzed in a regression analysis. Human interrater reliability was calculated.</p><p><strong>Results: </strong>Human participants achieved an average accuracy of 79.3% and a maximum accuracy of 85% in the formally qualified group. Achieving 90% accuracy, the CNN performed better but not significantly. Pertinent formal qualification (β = 0.083, P < .001) and diagnostic self-confidence (β = 0.015, P = .002) significantly predicted human accuracy, while work experience and focus on wound care had no effect (R2 = 24.3%). 
Overall interrater reliability was \\\"fair\\\" (Kappa = 0.391).</p><p><strong>Discussion: </strong>Among the \\\"expertise\\\"-related factors, only the qualification and self-confidence variables influenced diagnostic accuracy. These findings challenge previous assumptions about work experience or job titles defining \\\"expertise\\\" and influencing human diagnostic performance.</p><p><strong>Conclusion: </strong>This study offers guidance to future studies when comparing human expert and AI task performance. However, to explain human diagnostic accuracy, \\\"expertise\\\" may only serve as one correlate, while additional factors need further research.</p>\",\"PeriodicalId\":50016,\"journal\":{\"name\":\"Journal of the American Medical Informatics Association\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-07-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of the American Medical Informatics Association\",\"FirstCategoryId\":\"91\",\"ListUrlMain\":\"https://doi.org/10.1093/jamia/ocaf116\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Medical Informatics Association","FirstCategoryId":"91","ListUrlMain":"https://doi.org/10.1093/jamia/ocaf116","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Diagnostic accuracy differences in detecting wound maceration between humans and artificial intelligence: the role of human expertise revisited.
Objective: This study aims to compare the diagnostic abilities of humans in wound image assessment with those of an AI-based model, examine how "expertise" affects clinicians' diagnostic performance, and investigate the heterogeneity in clinical judgments.
Materials and methods: A total of 481 healthcare professionals completed a diagnostic task involving 30 chronic wound images with and without maceration. A convolutional neural network (CNN) classification model performed the same task. To predict human accuracy, participants' "expertise," ie, pertinent formal qualification, work experience, self-confidence, and wound focus, was analyzed in a regression analysis. Human interrater reliability was calculated.
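The regression described above can be sketched in a few lines. Everything below is a hypothetical illustration: the data are synthetic, the variable scorings are assumptions, and the simulated effect sizes are merely chosen to echo the reported β values; none of it reproduces the study's actual data or model code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictors, one row per participant (scorings are assumptions)
n = 481
qualification = rng.integers(0, 2, n)     # pertinent formal qualification (0/1)
experience    = rng.uniform(0, 30, n)     # years of work experience
confidence    = rng.uniform(1, 5, n)      # diagnostic self-confidence rating
wound_focus   = rng.integers(0, 2, n)     # focus on wound care (0/1)

# Simulated accuracy in which only qualification and confidence matter
accuracy = (0.70 + 0.083 * qualification + 0.015 * confidence
            + rng.normal(0, 0.03, n))

# Ordinary least squares fit with an intercept column
X = np.column_stack([np.ones(n), qualification, experience,
                     confidence, wound_focus])
beta, *_ = np.linalg.lstsq(X, accuracy, rcond=None)

# R^2: share of accuracy variance explained by the predictors
residuals = accuracy - X @ beta
r2 = 1 - residuals.var() / accuracy.var()
print(np.round(beta, 3), round(r2, 3))
```

With synthetic data generated this way, the fitted coefficients for qualification and confidence land near the simulated effects while those for experience and wound focus stay near zero, mirroring the pattern the study reports.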
Results: Human participants achieved an average accuracy of 79.3%; the formally qualified group reached the highest accuracy at 85%. The CNN achieved 90% accuracy, outperforming the human participants, though the difference was not statistically significant. Pertinent formal qualification (β = 0.083, P < .001) and diagnostic self-confidence (β = 0.015, P = .002) significantly predicted human accuracy, whereas work experience and a focus on wound care had no effect (R² = 24.3%). Overall interrater reliability was "fair" (Kappa = 0.391).
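When many raters judge the same set of items, interrater reliability is commonly quantified with Fleiss' kappa. The abstract does not state which kappa variant the study used, so the following is only an illustrative sketch on a toy count matrix, not the study's computation.

```python
import numpy as np

def fleiss_kappa(ratings: np.ndarray) -> float:
    """Fleiss' kappa for an (items x categories) matrix of rating counts.

    Each row must sum to the same number of raters. Returns chance-corrected
    agreement: (observed - expected) / (1 - expected).
    """
    n_items, _ = ratings.shape
    n_raters = int(ratings[0].sum())
    # Overall proportion of ratings falling into each category
    p_j = ratings.sum(axis=0) / (n_items * n_raters)
    # Per-item agreement: fraction of concordant rater pairs
    P_i = ((ratings ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_bar = P_i.mean()            # observed agreement
    P_e = (p_j ** 2).sum()        # agreement expected by chance
    return (P_bar - P_e) / (1 - P_e)

# Toy example: 5 wound images, 10 raters each, two labels (maceration / none)
counts = np.array([[8, 2], [9, 1], [5, 5], [10, 0], [2, 8]])
print(round(fleiss_kappa(counts), 3))  # → 0.326
```

Values in the 0.21-0.40 range are conventionally interpreted as "fair" agreement (Landis and Koch), which is the band the study's reported Kappa of 0.391 falls into.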
Discussion: Among the "expertise"-related factors, only the qualification and self-confidence variables influenced diagnostic accuracy. These findings challenge the common assumption that work experience or job titles define "expertise" and drive human diagnostic performance.
Conclusion: This study offers guidance for future studies comparing human expert and AI task performance. However, in explaining human diagnostic accuracy, "expertise" may serve as only one correlate; additional factors require further research.
Journal introduction:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.