{"title":"Understanding Hallucinations in Large Visual and Language Models","authors":"Zheng Yi Ho, Siyuan Liang, Dacheng Tao","doi":"10.1145/3811409","DOIUrl":null,"url":null,"abstract":"The rapid deployment of large language and vision models in real-world applications has intensified the need to address hallucinations—instances where models generate incorrect or incoherent outputs. These failures can spread misinformation and degrade workflows, causing financial and operational harm. Despite extensive research efforts, our understanding of hallucinations remains limited and fragmented. Without clear understanding, solutions risk addressing disparate symptoms rather than root causes, which undermines their effectiveness and generalisability during deployment. To address this, we first introduce a unified, multi-level framework to characterise both image and text hallucinations across broad applications, helping reduce conceptual fragmentation. Then, we trace their root causes to identifiable mechanisms within a model’s lifecycle in a task-modality interleaved manner, fostering a deeper and more holistic understanding. Our investigations reveal hallucinations as predictable consequences of underlying distributions and biases. By enhancing our understanding of hallucinations, this survey lays the groundwork for more effective solutions to hallucinations in generative AI systems.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"1 1","pages":""},"PeriodicalIF":28.0000,"publicationDate":"2026-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Computing Surveys","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3811409","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0
Abstract
The rapid deployment of large language and vision models in real-world applications has intensified the need to address hallucinations—instances where models generate incorrect or incoherent outputs. These failures can spread misinformation and degrade workflows, causing financial and operational harm. Despite extensive research efforts, our understanding of hallucinations remains limited and fragmented. Without a clear understanding, solutions risk addressing disparate symptoms rather than root causes, which undermines their effectiveness and generalisability during deployment. To address this, we first introduce a unified, multi-level framework to characterise both image and text hallucinations across broad applications, helping reduce conceptual fragmentation. Then, we trace their root causes to identifiable mechanisms within a model's lifecycle in a task-modality interleaved manner, fostering a deeper and more holistic understanding. Our investigations reveal hallucinations as predictable consequences of underlying distributions and biases. By enhancing our understanding of hallucinations, this survey lays the groundwork for more effective solutions to hallucinations in generative AI systems.
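The paper's framework is not reproduced on this page, so the following is a purely illustrative sketch of one way a multi-level, task-modality interleaved characterisation of hallucinations could be encoded: as a small taxonomy of tagged failure records. All names below (Modality, Level, HallucinationRecord, and the example values) are hypothetical and are not taken from the paper.

    # Illustrative sketch only: a hypothetical encoding of a multi-level,
    # cross-modality hallucination taxonomy. None of these names come from
    # the surveyed paper.
    from dataclasses import dataclass
    from enum import Enum

    class Modality(Enum):
        TEXT = "text"    # language-model outputs
        IMAGE = "image"  # image-generation outputs

    class Level(Enum):
        FACTUAL = "factual"            # output contradicts world knowledge
        FAITHFULNESS = "faithfulness"  # output contradicts the given input or context
        COHERENCE = "coherence"        # output is internally inconsistent

    @dataclass
    class HallucinationRecord:
        """One observed failure, tagged along modality, level, and task axes."""
        modality: Modality
        level: Level
        task: str         # e.g. "question answering", "text-to-image"
        description: str

    # Usage: two failures from different modalities tagged under one schema.
    records = [
        HallucinationRecord(Modality.TEXT, Level.FACTUAL, "question answering",
                            "model asserts a citation that does not exist"),
        HallucinationRecord(Modality.IMAGE, Level.FAITHFULNESS, "text-to-image",
                            "generated image omits an object named in the prompt"),
    ]
    for r in records:
        print(f"[{r.modality.value}/{r.level.value}] {r.task}: {r.description}")

Tagging failures along shared axes like this is one concrete reading of "characterising image and text hallucinations at multiple levels"; the paper's actual framework may organise the space differently.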
About the journal:
ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials across the areas of computing research and practice. Its articles aim to be comprehensive yet accessible, guiding readers through the literature and helping them understand topics outside their own specialties. CSUR has a 2022 Impact Factor of 16.6 and is ranked 3rd out of 111 journals in the field of Computer Science, Theory & Methods.
ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.