Reporting radiographers' interaction with Artificial Intelligence-How do different forms of AI feedback impact trust and decision switching?

PLOS Digital Health | Pub Date: 2024-08-07 | eCollection Date: 2024-08-01 | DOI: 10.1371/journal.pdig.0000560
Clare Rainey, Raymond Bond, Jonathan McConnell, Ciara Hughes, Devinder Kumar, Sonyia McFadden
{"title":"Reporting radiographers' interaction with Artificial Intelligence-How do different forms of AI feedback impact trust and decision switching?","authors":"Clare Rainey, Raymond Bond, Jonathan McConnell, Ciara Hughes, Devinder Kumar, Sonyia McFadden","doi":"10.1371/journal.pdig.0000560","DOIUrl":null,"url":null,"abstract":"<p><p>Artificial Intelligence (AI) has been increasingly integrated into healthcare settings, including the radiology department to aid radiographic image interpretation, including reporting by radiographers. Trust has been cited as a barrier to effective clinical implementation of AI. Appropriating trust will be important in the future with AI to ensure the ethical use of these systems for the benefit of the patient, clinician and health services. Means of explainable AI, such as heatmaps have been proposed to increase AI transparency and trust by elucidating which parts of image the AI 'focussed on' when making its decision. The aim of this novel study was to quantify the impact of different forms of AI feedback on the expert clinicians' trust. Whilst this study was conducted in the UK, it has potential international application and impact for AI interface design, either globally or in countries with similar cultural and/or economic status to the UK. A convolutional neural network was built for this study; trained, validated and tested on a publicly available dataset of MUsculoskeletal RAdiographs (MURA), with binary diagnoses and Gradient Class Activation Maps (GradCAM) as outputs. Reporting radiographers (n = 12) were recruited to this study from all four regions of the UK. Qualtrics was used to present each participant with a total of 18 complete examinations from the MURA test dataset (each examination contained more than one radiographic image). Participants were presented with the images first, images with heatmaps next and finally an AI binary diagnosis in a sequential order. Perception of trust in the AI systems was obtained following the presentation of each heatmap and binary feedback. The participants were asked to indicate whether they would change their mind (or decision switch) in response to the AI feedback. Participants disagreed with the AI heatmaps for the abnormal examinations 45.8% of the time and agreed with binary feedback on 86.7% of examinations (26/30 presentations).'Only two participants indicated that they would decision switch in response to all AI feedback (GradCAM and binary) (0.7%, n = 2) across all datasets. 22.2% (n = 32) of participants agreed with the localisation of pathology on the heatmap. The level of agreement with the GradCAM and binary diagnosis was found to be correlated with trust (GradCAM:-.515;-.584, significant large negative correlation at 0.01 level (p = < .01 and-.309;-.369, significant medium negative correlation at .01 level (p = < .01) for GradCAM and binary diagnosis respectively). This study shows that the extent of agreement with both AI binary diagnosis and heatmap is correlated with trust in AI for the participants in this study, where greater agreement with the form of AI feedback is associated with greater trust in AI, in particular in the heatmap form of AI feedback. 
Forms of explainable AI should be developed with cognisance of the need for precision and accuracy in localisation to promote appropriate trust in clinical end users.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 8","pages":"e0000560"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11305567/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLOS digital health","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1371/journal.pdig.0000560","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/8/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Artificial Intelligence (AI) has been increasingly integrated into healthcare settings, including the radiology department, to aid radiographic image interpretation, including reporting by radiographers. Trust has been cited as a barrier to effective clinical implementation of AI. Appropriately calibrated trust will be important in the future with AI to ensure the ethical use of these systems for the benefit of the patient, clinician and health services. Means of explainable AI, such as heatmaps, have been proposed to increase AI transparency and trust by elucidating which parts of an image the AI 'focussed on' when making its decision. The aim of this novel study was to quantify the impact of different forms of AI feedback on expert clinicians' trust. Whilst this study was conducted in the UK, it has potential international application and impact for AI interface design, either globally or in countries with similar cultural and/or economic status to the UK. A convolutional neural network was built for this study and was trained, validated and tested on a publicly available dataset of MUsculoskeletal RAdiographs (MURA), with binary diagnoses and Gradient Class Activation Maps (GradCAM) as outputs. Reporting radiographers (n = 12) were recruited to this study from all four regions of the UK. Qualtrics was used to present each participant with a total of 18 complete examinations from the MURA test dataset (each examination contained more than one radiographic image). Participants were presented with the images first, then the images with heatmaps, and finally an AI binary diagnosis, in a sequential order. Perception of trust in the AI system was obtained following the presentation of each heatmap and each binary diagnosis. Participants were asked to indicate whether they would change their mind (decision switch) in response to the AI feedback. Participants disagreed with the AI heatmaps for the abnormal examinations 45.8% of the time and agreed with the binary feedback on 86.7% of examinations (26/30 presentations). Only two participants indicated that they would decision switch in response to all AI feedback (GradCAM and binary) (0.7%, n = 2) across all datasets. Participants agreed with the localisation of pathology on the heatmap in 22.2% of responses (n = 32). The level of agreement with the GradCAM heatmap and the binary diagnosis was correlated with trust (GradCAM: -.515; -.584, a significant large negative correlation at the 0.01 level, p < .01; binary diagnosis: -.309; -.369, a significant medium negative correlation at the 0.01 level, p < .01). This study shows that, for these participants, the extent of agreement with both the AI binary diagnosis and the heatmap is correlated with trust in AI, with greater agreement with a given form of AI feedback associated with greater trust in AI, particularly for the heatmap form of feedback. Forms of explainable AI should be developed with cognisance of the need for precision and accuracy in localisation to promote appropriate trust in clinical end users.
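The two CNN outputs named in the abstract, a binary diagnosis and a GradCAM heatmap, are typically produced in a single forward and backward pass. The paper does not publish code, so the following is only a minimal sketch of that kind of pipeline; the ResNet-18 backbone, the hooked layer and the input size are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical illustration, not the authors' code: a minimal Grad-CAM pipeline
# for a binary abnormality classifier on a musculoskeletal radiograph.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)                  # assumed backbone
model.fc = torch.nn.Linear(model.fc.in_features, 2)    # binary output: normal / abnormal
model.eval()

activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["feat"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["feat"] = grad_output[0].detach()

# Hook the last convolutional block so Grad-CAM is computed from its feature maps.
model.layer4.register_forward_hook(save_activation)
model.layer4.register_full_backward_hook(save_gradient)

def binary_diagnosis_with_gradcam(image):
    """image: (1, 3, H, W) float tensor; returns (predicted class, heatmap in [0, 1])."""
    logits = model(image)
    pred = logits.argmax(dim=1)
    model.zero_grad()
    logits[0, pred].backward()                          # gradient of the predicted class score
    # Weight each feature map by its spatially averaged gradient, sum, and rectify.
    weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["feat"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return pred.item(), cam[0, 0]

# Example call on a random tensor standing in for a preprocessed radiograph:
# diagnosis, heatmap = binary_diagnosis_with_gradcam(torch.rand(1, 3, 224, 224))
```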
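The reported figures (-.515/-.584 for GradCAM and -.309/-.369 for the binary diagnosis) are correlation coefficients between agreement and trust. The abstract does not state which coefficient or variable coding was used; the sketch below assumes Spearman's rho on purely illustrative data, with agreement coded 1 = agree / 2 = disagree against a Likert trust rating, a coding under which greater agreement with the AI would appear as a negative coefficient.

```python
# Hypothetical illustration: correlating per-presentation agreement with trust ratings.
# The statistic (Spearman's rho) and the coding are assumptions; the data are made up.
from scipy.stats import spearmanr

# Agreement coded 1 = agree, 2 = disagree; trust on a 1-5 Likert scale.
agreement_with_gradcam = [1, 2, 1, 1, 2, 1, 2, 1, 1, 2]
trust_after_gradcam    = [5, 2, 4, 5, 1, 4, 2, 5, 4, 2]

rho, p_value = spearmanr(agreement_with_gradcam, trust_after_gradcam)
print(f"agreement vs. trust: rho = {rho:.3f}, p = {p_value:.4f}")
```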
