Polite AI mitigates user susceptibility to AI hallucinations
Authors: Richard Pak, Ericka Rovira, Anne McLaughlin
Journal: Ergonomics, pp. 1-11 (Journal Article, published 2024-11-28)
DOI: 10.1080/00140139.2024.2434604 (https://doi.org/10.1080/00140139.2024.2434604)
Impact factor: 2.0; JCR: Q3 (Engineering, Industrial)
Citations: 0
Abstract
With their increased capability, AI-based chatbots have become increasingly popular tools to help users answer complex queries. However, these chatbots may hallucinate, or generate incorrect but very plausible-sounding information, more frequently than previously thought. Thus, it is crucial to examine strategies to mitigate human susceptibility to hallucinated output. In a between-subjects experiment, participants completed a difficult quiz with assistance from either a polite or neutral-toned AI chatbot, which occasionally provided hallucinated (incorrect) information. Signal detection analysis revealed that participants interacting with polite-AI showed modestly higher sensitivity in detecting hallucinations and a more conservative response bias compared to those interacting with neutral-toned AI. While the observed effect sizes were modest, even small improvements in users' ability to detect AI hallucinations can have significant consequences, particularly in high-stakes domains or when aggregated across millions of AI interactions.
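The abstract does not say which signal detection measures were used. As a minimal sketch, assuming the common equal-variance Gaussian model, sensitivity can be computed as d' = z(H) - z(F) and response bias as the criterion c = -(z(H) + z(F)) / 2, where H is the hit rate and F the false-alarm rate. The function name, the log-linear correction, and the mapping of trial outcomes to hits and false alarms are illustrative assumptions, not details taken from the paper.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under an equal-variance Gaussian model.

    Assumed mapping (not from the paper): the "signal" is a hallucinated
    answer, a hit is a hallucination the participant rejected, and a false
    alarm is a correct answer the participant rejected.
    """
    # Log-linear correction keeps both rates strictly between 0 and 1,
    # where the inverse normal CDF would otherwise be undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


# Hypothetical counts for one participant, for illustration only.
d_prime, criterion = sdt_measures(hits=8, misses=2,
                                  false_alarms=2, correct_rejections=8)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

Under this convention a larger d' means hallucinated and correct answers are told apart more reliably, and a positive c means the participant flags answers as hallucinated less readily.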
About the journal:
Ergonomics, also known as human factors, is the scientific discipline that seeks to understand and improve human interactions with products, equipment, environments and systems. Drawing upon human biology, psychology, engineering and design, Ergonomics aims to develop and apply knowledge and techniques to optimise system performance, whilst protecting the health, safety and well-being of individuals involved. The attention of ergonomics extends across work, leisure and other aspects of our daily lives.
The journal Ergonomics is an international refereed publication with a 60-year tradition of disseminating high-quality research. Original submissions, both theoretical and applied, are invited from across the subject, including physical, cognitive, organisational and environmental ergonomics. Papers reporting the findings of research from cognate disciplines are also welcome where they contribute to understanding equipment, tasks, jobs, systems and environments and the corresponding needs, abilities and limitations of people.
All published research articles in this journal have undergone rigorous peer review, based on initial editor screening and anonymous refereeing by independent expert referees.