Catch me if you search: When contextual web search results affect the detection of hallucinations

IF 8.9 · CAS Tier 1, Psychology · JCR Q1, Psychology, Experimental
Mahjabin Nahar, Eun-Ju Lee, Jin Won Park, Dongwon Lee
{"title":"如果你搜索我就会发现:当上下文网络搜索结果影响对幻觉的检测时","authors":"Mahjabin Nahar ,&nbsp;Eun-Ju Lee ,&nbsp;Jin Won Park ,&nbsp;Dongwon Lee","doi":"10.1016/j.chb.2025.108763","DOIUrl":null,"url":null,"abstract":"<div><div>While we increasingly rely on large language models (LLMs) for various tasks, these models are known to produce inaccurate content or ‘hallucinations’ with potentially disastrous consequences. The recent integration of web search results into LLMs prompts the question of whether people utilize them to verify the generated content, thereby accurately detecting hallucinations. An online experiment (<span><math><mrow><mi>N</mi><mo>=</mo><mn>560</mn></mrow></math></span>) investigated how the provision of search results, either static (i.e., fixed search results provided by LLM) or dynamic (i.e., participant-led searches), affects participants’ perceived accuracy of LLM-generated content (i.e., genuine, minor hallucination, major hallucination), self-confidence in accuracy ratings, as well as their overall evaluation of the LLM, as compared to the control condition (i.e., no search results). Results showed that participants in both static and dynamic conditions (vs. control) rated hallucinated content to be less accurate and perceived the LLM more negatively. However, those in the dynamic condition rated genuine content as more accurate and demonstrated greater overall self-confidence in their assessments than those in the static search or control conditions. We highlighted practical implications of incorporating web search functionality into LLMs in real-world contexts.</div></div>","PeriodicalId":48471,"journal":{"name":"Computers in Human Behavior","volume":"173 ","pages":"Article 108763"},"PeriodicalIF":8.9000,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Catch me if you search: When contextual web search results affect the detection of hallucinations\",\"authors\":\"Mahjabin Nahar ,&nbsp;Eun-Ju Lee ,&nbsp;Jin Won Park ,&nbsp;Dongwon Lee\",\"doi\":\"10.1016/j.chb.2025.108763\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>While we increasingly rely on large language models (LLMs) for various tasks, these models are known to produce inaccurate content or ‘hallucinations’ with potentially disastrous consequences. The recent integration of web search results into LLMs prompts the question of whether people utilize them to verify the generated content, thereby accurately detecting hallucinations. An online experiment (<span><math><mrow><mi>N</mi><mo>=</mo><mn>560</mn></mrow></math></span>) investigated how the provision of search results, either static (i.e., fixed search results provided by LLM) or dynamic (i.e., participant-led searches), affects participants’ perceived accuracy of LLM-generated content (i.e., genuine, minor hallucination, major hallucination), self-confidence in accuracy ratings, as well as their overall evaluation of the LLM, as compared to the control condition (i.e., no search results). Results showed that participants in both static and dynamic conditions (vs. control) rated hallucinated content to be less accurate and perceived the LLM more negatively. However, those in the dynamic condition rated genuine content as more accurate and demonstrated greater overall self-confidence in their assessments than those in the static search or control conditions. 
We highlighted practical implications of incorporating web search functionality into LLMs in real-world contexts.</div></div>\",\"PeriodicalId\":48471,\"journal\":{\"name\":\"Computers in Human Behavior\",\"volume\":\"173 \",\"pages\":\"Article 108763\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-08-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers in Human Behavior\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0747563225002109\",\"RegionNum\":1,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"PSYCHOLOGY, EXPERIMENTAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers in Human Behavior","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0747563225002109","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0

Abstract

While we increasingly rely on large language models (LLMs) for various tasks, these models are known to produce inaccurate content or ‘hallucinations’ with potentially disastrous consequences. The recent integration of web search results into LLMs prompts the question of whether people utilize them to verify the generated content, thereby accurately detecting hallucinations. An online experiment (N=560) investigated how the provision of search results, either static (i.e., fixed search results provided by LLM) or dynamic (i.e., participant-led searches), affects participants’ perceived accuracy of LLM-generated content (i.e., genuine, minor hallucination, major hallucination), self-confidence in accuracy ratings, as well as their overall evaluation of the LLM, as compared to the control condition (i.e., no search results). Results showed that participants in both static and dynamic conditions (vs. control) rated hallucinated content to be less accurate and perceived the LLM more negatively. However, those in the dynamic condition rated genuine content as more accurate and demonstrated greater overall self-confidence in their assessments than those in the static search or control conditions. We highlighted practical implications of incorporating web search functionality into LLMs in real-world contexts.
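To make the three search conditions concrete, below is a minimal, hypothetical Python sketch of how one trial of such an experiment might be assembled. Every name in it (SearchCondition, ContentType, present_trial, fake_search) is an illustrative assumption; the paper describes the design but does not publish an implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class SearchCondition(Enum):
    CONTROL = "control"  # no search results shown
    STATIC = "static"    # fixed search results provided with the LLM output
    DYNAMIC = "dynamic"  # participant issues their own searches

class ContentType(Enum):
    GENUINE = "genuine"
    MINOR_HALLUCINATION = "minor"
    MAJOR_HALLUCINATION = "major"

@dataclass
class Trial:
    llm_output: str
    content_type: ContentType
    condition: SearchCondition

def present_trial(trial: Trial, search: Callable[[str], list[str]]) -> dict:
    """Assemble what a participant sees for one trial, by condition."""
    view = {"llm_output": trial.llm_output,
            "search_results": [],
            "search_box_enabled": False}
    if trial.condition is SearchCondition.STATIC:
        # Static condition: pre-fetched results shown next to the output.
        view["search_results"] = search(trial.llm_output)
    elif trial.condition is SearchCondition.DYNAMIC:
        # Dynamic condition: no pre-fetched results; the participant
        # drives verification through a live search box instead.
        view["search_box_enabled"] = True
    return view

def fake_search(query: str) -> list[str]:
    # Toy stand-in for a real web search API.
    return [f"Top result for: {query[:40]}"]

if __name__ == "__main__":
    trial = Trial("The Eiffel Tower is 330 m tall.",
                  ContentType.GENUINE, SearchCondition.STATIC)
    print(present_trial(trial, fake_search))
```

The control condition corresponds to the default branch: the participant sees only the LLM output, with no results and no search box.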
Source journal: Computers in Human Behavior
CiteScore: 19.10
Self-citation rate: 4.00%
Articles published per year: 381
Review time: 40 days
Journal overview: Computers in Human Behavior is a scholarly journal that explores the psychological aspects of computer use. It covers original theoretical works, research reports, literature reviews, and software and book reviews. The journal examines both the use of computers in psychology, psychiatry, and related fields, and the psychological impact of computer use on individuals, groups, and society. Articles discuss topics such as professional practice, training, research, human development, learning, cognition, personality, and social interactions. It focuses on human interactions with computers, considering the computer as a medium through which human behaviors are shaped and expressed. Professionals interested in the psychological aspects of computer use will find this journal valuable, even with limited knowledge of computers.