Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling

Xinyue Fang, Zhen Huang, Zhiliang Tian, Minghui Fang, Ziyi Pan, Quntian Fang, Zhihua Wen, Hengyue Pan, Dongsheng Li
{"title":"Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling","authors":"Xinyue Fang, Zhen Huang, Zhiliang Tian, Minghui Fang, Ziyi Pan, Quntian Fang, Zhihua Wen, Hengyue Pan, Dongsheng Li","doi":"arxiv-2409.11283","DOIUrl":null,"url":null,"abstract":"LLMs obtain remarkable performance but suffer from hallucinations. Most\nresearch on detecting hallucination focuses on the questions with short and\nconcrete correct answers that are easy to check the faithfulness. Hallucination\ndetections for text generation with open-ended answers are more challenging.\nSome researchers use external knowledge to detect hallucinations in generated\ntexts, but external resources for specific scenarios are hard to access. Recent\nstudies on detecting hallucinations in long text without external resources\nconduct consistency comparison among multiple sampled outputs. To handle long\ntexts, researchers split long texts into multiple facts and individually\ncompare the consistency of each pairs of facts. However, these methods (1)\nhardly achieve alignment among multiple facts; (2) overlook dependencies\nbetween multiple contextual facts. In this paper, we propose a graph-based\ncontext-aware (GCA) hallucination detection for text generations, which aligns\nknowledge facts and considers the dependencies between contextual knowledge\ntriples in consistency comparison. Particularly, to align multiple facts, we\nconduct a triple-oriented response segmentation to extract multiple knowledge\ntriples. To model dependencies among contextual knowledge triple (facts), we\nconstruct contextual triple into a graph and enhance triples' interactions via\nmessage passing and aggregating via RGCN. To avoid the omission of knowledge\ntriples in long text, we conduct a LLM-based reverse verification via\nreconstructing the knowledge triples. Experiments show that our model enhances\nhallucination detection and excels all baselines.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":"50 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computation and Language","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11283","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

LLMs obtain remarkable performance but suffer from hallucinations. Most research on detecting hallucination focuses on the questions with short and concrete correct answers that are easy to check the faithfulness. Hallucination detections for text generation with open-ended answers are more challenging. Some researchers use external knowledge to detect hallucinations in generated texts, but external resources for specific scenarios are hard to access. Recent studies on detecting hallucinations in long text without external resources conduct consistency comparison among multiple sampled outputs. To handle long texts, researchers split long texts into multiple facts and individually compare the consistency of each pairs of facts. However, these methods (1) hardly achieve alignment among multiple facts; (2) overlook dependencies between multiple contextual facts. In this paper, we propose a graph-based context-aware (GCA) hallucination detection for text generations, which aligns knowledge facts and considers the dependencies between contextual knowledge triples in consistency comparison. Particularly, to align multiple facts, we conduct a triple-oriented response segmentation to extract multiple knowledge triples. To model dependencies among contextual knowledge triple (facts), we construct contextual triple into a graph and enhance triples' interactions via message passing and aggregating via RGCN. To avoid the omission of knowledge triples in long text, we conduct a LLM-based reverse verification via reconstructing the knowledge triples. Experiments show that our model enhances hallucination detection and excels all baselines.
通过基于图的上下文知识三元组建模检测文本生成中的零资源幻觉
LLMs 性能卓越,但也存在幻觉。大多数关于幻觉检测的研究都集中在具有简短而具体的正确答案的问题上,这样的问题很容易检测其忠实性。一些研究人员使用外部知识来检测生成文本中的幻觉,但很难获取特定场景的外部资源。最近关于在没有外部资源的情况下检测长文本中幻觉的研究在多个采样输出中进行了一致性比较。为了处理长文本,研究人员将长文本分割成多个事实,并单独比较每对事实的一致性。然而,这些方法(1)很难实现多个事实之间的一致性;(2)忽略了多个上下文事实之间的依赖关系。本文提出了一种基于图的上下文感知(GCA)的文本生成幻觉检测方法,该方法对齐知识事实,并在一致性比较中考虑上下文知识三元组之间的依赖关系。特别是,为了对齐多个事实,我们进行了面向三重的响应分割,以提取多个知识要素。为了模拟上下文知识三元(事实)之间的依赖关系,我们将上下文三元构建成图,并通过 RGCN 进行信息传递和聚合,增强三元之间的交互。为了避免长文本中知识三元组的遗漏,我们通过重新构建知识三元组来进行基于 LLM 的反向验证。实验表明,我们的模型增强了幻觉检测能力,并优于所有基线模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信