Trust and accuracy in AI: Optometrists favor multimodal AI systems over unimodal for glaucoma diagnosis in collaborative environment

IF 6.3 | Zone 2 (Medicine) | JCR Q1 (Biology)

Computers in biology and medicine Pub Date : 2025-09-29 DOI:10.1016/j.compbiomed.2025.111132

Faisal Ghaffar, Yousuf Zia Islam, Nadine Furtado, Catherine Burns

Volume 198, Article 111132. Publication type: Journal Article. Source: https://www.sciencedirect.com/science/article/pii/S0010482525014854


Abstract

Background:

User trust and decision accuracy are crucial for the successful collaboration of humans and Artificial Intelligence (AI) systems, especially in clinical settings such as glaucoma diagnosis. Both trust and accuracy are influenced by the data modality used by AI systems, which directly impacts the effectiveness of human-AI collaboration.

Objective:

This study aims to identify which AI system modality aligns best with optometrists’ mental models by comparing trust levels between unimodal and multimodal AI systems used for glaucoma diagnosis. We also examine the impact of modality on individual targets of user trust and on user performance.

Methods:

We conducted a within-subject study with 20 optometrists, who were presented with both unimodal and multimodal AI mock-up systems specifically designed for glaucoma diagnosis. Trust was measured across five key targets using 5-point Likert-scale questionnaires. Statistical analysis was performed to assess trust differences between the unimodal and multimodal AI systems. Optometrist performance was evaluated based on the alignment of their decisions with those of the unimodal and multimodal AI systems.
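The abstract does not say which statistical test was used for the trust comparison. As a minimal sketch of one way to compare paired ratings in a within-subject design, the exact sign test below is a simple stand-in, not the authors' method. The function name `sign_test` and the ratings are hypothetical and are not data from the study.

```python
from math import comb


def sign_test(first, second):
    """Two-sided exact sign test for paired ratings.

    Returns (n_pos, n_neg, p_value). Pairs with no difference are dropped.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg
    if n == 0:
        # Every pair is tied, so there is no evidence of a difference.
        return n_pos, n_neg, 1.0
    k = min(n_pos, n_neg)
    # Under the null hypothesis each nonzero difference is positive or
    # negative with probability 1/2, so the count follows Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n_neg, min(1.0, 2 * tail)


# Hypothetical 5-point Likert ratings from six participants (NOT study data).
unimodal = [3, 3, 2, 4, 3, 3]
multimodal = [4, 4, 3, 4, 5, 4]
n_pos, n_neg, p_value = sign_test(unimodal, multimodal)
print(n_pos, n_neg, p_value)
```

The sign test uses only the direction of each participant's change, which suits ordinal Likert data but discards magnitude. A rank-based paired test such as the Wilcoxon signed-rank test would typically have more power.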

Results:

The multimodal system received a higher mean trust rating (3.98 on the 5-point Likert scale) than the unimodal system (3.00). This difference was statistically significant (p < .001), with further analysis revealing significant variation across all trust targets (p < .001). Additionally, optometrists achieved higher F1 scores with the multimodal system (43.1%) than with the unimodal system (37.3%), while accuracy remained comparable between the two systems (63.0% for multimodal and 63.3% for unimodal). However, major differences across these metrics were observed in relation to optometrists’ expertise.
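F1 can differ between two conditions even when accuracy is nearly identical, because F1 ignores true negatives and depends only on how positive cases are handled. The sketch below illustrates this with two hypothetical confusion matrices. The counts and function names are illustrative assumptions and do not reproduce the study's data or its alignment-based scoring.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all decisions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for 100 cases per condition (NOT study data).
# Both conditions get 63 of 100 decisions right, but condition A finds
# more of the positive cases than condition B.
cond_a = {"tp": 20, "fp": 10, "fn": 27, "tn": 43}
cond_b = {"tp": 10, "fp": 7, "fn": 30, "tn": 53}

for name, c in (("A", cond_a), ("B", cond_b)):
    acc = accuracy(c["tp"], c["fp"], c["fn"], c["tn"])
    f1 = f1_score(c["tp"], c["fp"], c["fn"])
    print(f"condition {name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

In this toy setup both conditions share the same accuracy, yet condition A has the higher F1 because it has more true positives relative to its false positives and false negatives.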

Conclusions:

Multimodal AI systems, which use the same data modality as optometrists and align more closely with their mental models and decision-making processes, result in higher user trust and improved diagnostic performance. Therefore, for effective human-AI collaboration in healthcare, specifically for glaucoma diagnosis, AI systems should be designed to utilize the same data sources as the human counterparts, ensuring consistency and improving both trust and decision accuracy.




Source journal: Computers in biology and medicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 11.70

Self-citation rate: 10.40%

Articles published: 1086

Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.