{"title":"Beyond Binary Decisions: Evaluating the Effects of AI Error Type on Trust and Performance in AI-Assisted Tasks.","authors":"Jin Yong Kim, Corey Lester, X Jessie Yang","doi":"10.1177/00187208251326795","DOIUrl":null,"url":null,"abstract":"<p><p>ObjectiveWe investigated how various error patterns from an AI aid in the nonbinary decision scenario influence human operators' trust in the AI system and their task performance.BackgroundExisting research on trust in automation/autonomy predominantly uses the signal detection theory (SDT) to model autonomy performance. The SDT classifies the world into binary states and hence oversimplifies the interaction observed in real-world scenarios. Allowing multi-class classification of the world reveals intriguing error patterns previously unexplored in prior literature.MethodThirty-five participants completed 60 trials of a simulated mental rotation task assisted by an AI with 70-80% reliability. Participants' trust in and dependence on the AI system and their performance were measured. By combining participants' initial performance and the AI aid's performance, five distinct patterns emerged. Mixed-effects models were built to examine the effects of different patterns on trust adjustment, performance, and reaction time.ResultsVarying error patterns from AI impacted performance, reaction times, and trust. Some AI errors provided false reassurance, misleading operators into believing their incorrect decisions were correct, worsening performance and trust. Paradoxically, some AI errors prompted safety checks and verifications, which, despite causing a moderate decrease in trust, ultimately enhanced overall performance.ConclusionThe findings demonstrate that the types of errors made by an AI system significantly affect human trust and performance, emphasizing the need to model the complicated human-AI interaction in real life.ApplicationThese insights can guide the development of AI systems that classify the state of the world into multiple classes, enabling the operators to make more informed and accurate decisions based on feedback.</p>","PeriodicalId":56333,"journal":{"name":"Human Factors","volume":" ","pages":"187208251326795"},"PeriodicalIF":2.9000,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Human Factors","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/00187208251326795","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BEHAVIORAL SCIENCES","Score":null,"Total":0}
引用次数: 0
Abstract
Objective: We investigated how various error patterns from an AI aid in a nonbinary decision scenario influence human operators' trust in the AI system and their task performance.

Background: Existing research on trust in automation/autonomy predominantly uses signal detection theory (SDT) to model autonomy performance. SDT classifies the world into binary states and hence oversimplifies the interactions observed in real-world scenarios. Allowing multi-class classification of the world reveals error patterns unexplored in the prior literature.

Method: Thirty-five participants completed 60 trials of a simulated mental rotation task assisted by an AI aid with 70-80% reliability. Participants' trust in and dependence on the AI system, as well as their task performance, were measured. Combining participants' initial performance with the AI aid's performance yielded five distinct patterns. Mixed-effects models were built to examine the effects of the different patterns on trust adjustment, performance, and reaction time.

Results: The AI aid's error patterns affected performance, reaction time, and trust. Some AI errors provided false reassurance, misleading operators into believing that their incorrect decisions were correct, which worsened both performance and trust. Paradoxically, other AI errors prompted safety checks and verifications that, despite causing a moderate decrease in trust, ultimately enhanced overall performance.

Conclusion: The findings demonstrate that the type of error an AI system makes significantly affects human trust and performance, emphasizing the need to model the complexity of real-world human-AI interaction.

Application: These insights can guide the development of AI systems that classify the state of the world into multiple classes, enabling operators to make more informed and accurate decisions based on the AI's feedback.
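The abstract does not enumerate the five patterns, but in a multi-class task they plausibly arise from crossing the correctness of the operator's initial answer with the correctness of the AI's suggestion; crucially, two wrong answers can either coincide (the "false reassurance" case the Results describe) or conflict (prompting a safety check), a distinction binary SDT cannot express. The following minimal Python sketch illustrates such a taxonomy; the pattern names and the exact partition are assumptions inferred from the abstract, not the authors' materials.

```python
# Hypothetical reconstruction of a five-way human-AI outcome taxonomy
# for a nonbinary decision task. Names are illustrative assumptions.
from enum import Enum

class Pattern(Enum):
    BOTH_CORRECT = "human correct, AI correct"
    AI_ERROR_HUMAN_CORRECT = "human correct, AI wrong"
    AI_CORRECTS_HUMAN = "human wrong, AI correct"
    FALSE_REASSURANCE = "human wrong, AI wrong, same wrong answer"
    CONFLICTING_ERRORS = "human wrong, AI wrong, different wrong answers"

def classify_trial(human_answer: int, ai_answer: int, truth: int) -> Pattern:
    """Assign a trial to one of five outcome patterns."""
    human_ok = human_answer == truth
    ai_ok = ai_answer == truth
    if human_ok and ai_ok:
        return Pattern.BOTH_CORRECT
    if human_ok:
        return Pattern.AI_ERROR_HUMAN_CORRECT
    if ai_ok:
        return Pattern.AI_CORRECTS_HUMAN
    # Both wrong: agreement falsely reassures, disagreement flags a check.
    if human_answer == ai_answer:
        return Pattern.FALSE_REASSURANCE
    return Pattern.CONFLICTING_ERRORS

# Example: the operator picks option 2, the AI also suggests 2, truth is 3.
print(classify_trial(human_answer=2, ai_answer=2, truth=3))
# -> Pattern.FALSE_REASSURANCE
```

Note that in a binary world this five-way partition collapses into SDT's four cells: when both the human and the AI are wrong on a two-option question, they necessarily picked the same wrong option, so the last two patterns can only diverge with three or more classes.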
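The mixed-effects models mentioned in the Method can likewise be sketched. Below is a hedged illustration using statsmodels on synthetic data, with a random intercept per participant and a fixed effect of error pattern on trial-to-trial trust change; the variable names ("trust_change", "pattern", "participant") and the model specification are assumptions for illustration, not the paper's actual analysis.

```python
# Illustrative mixed-effects analysis on synthetic data (assumed
# variable names; not the paper's actual model specification).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 35 * 60  # 35 participants x 60 trials, mirroring the abstract
patterns = ["both_correct", "ai_error_human_correct", "ai_corrects_human",
            "false_reassurance", "conflicting_errors"]
df = pd.DataFrame({
    "participant": np.repeat(np.arange(35), 60),
    "pattern": rng.choice(patterns, size=n),
    # Synthetic outcome: change in a trust rating after AI feedback.
    "trust_change": rng.normal(0.0, 5.0, size=n),
})

# Fixed effect of error pattern; random intercept per participant.
model = smf.mixedlm("trust_change ~ C(pattern)", df, groups=df["participant"])
print(model.fit().summary())
```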
Journal Introduction:
Human Factors: The Journal of the Human Factors and Ergonomics Society publishes peer-reviewed scientific studies in human factors/ergonomics that present theoretical and practical advances concerning the relationship between people and technologies, tools, environments, and systems. Papers published in Human Factors leverage fundamental knowledge of human capabilities and limitations – and the basic understanding of cognitive, physical, behavioral, physiological, social, developmental, affective, and motivational aspects of human performance – to yield design principles; enhance training, selection, and communication; and ultimately improve human-system interfaces and sociotechnical systems that lead to safer and more effective outcomes.