The Choice of Textual Knowledge Base in Automated Claim Checking

IF 1.5 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS
Dominik Stammbach, Boya Zhang, Elliott Ash
{"title":"The Choice of Textual Knowledge Base in Automated Claim Checking","authors":"Dominik Stammbach, Boya Zhang, Elliott Ash","doi":"10.1145/3561389","DOIUrl":null,"url":null,"abstract":"Automated claim checking is the task of determining the veracity of a claim given evidence retrieved from a textual knowledge base of trustworthy facts. While previous work has taken the knowledge base as given and optimized the claim-checking pipeline, we take the opposite approach—taking the pipeline as given, we explore the choice of the knowledge base. Our first insight is that a claim-checking pipeline can be transferred to a new domain of claims with access to a knowledge base from the new domain. Second, we do not find a “universally best” knowledge base—higher domain overlap of a task dataset and a knowledge base tends to produce better label accuracy. Third, combining multiple knowledge bases does not tend to improve performance beyond using the closest-domain knowledge base. Finally, we show that the claim-checking pipeline’s confidence score for selecting evidence can be used to assess whether a knowledge base will perform well for a new set of claims, even in the absence of ground-truth labels.","PeriodicalId":44355,"journal":{"name":"ACM Journal of Data and Information Quality","volume":"40 1","pages":"1 - 22"},"PeriodicalIF":1.5000,"publicationDate":"2023-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Journal of Data and Information Quality","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3561389","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 1

Abstract

Automated claim checking is the task of determining the veracity of a claim given evidence retrieved from a textual knowledge base of trustworthy facts. While previous work has taken the knowledge base as given and optimized the claim-checking pipeline, we take the opposite approach—taking the pipeline as given, we explore the choice of the knowledge base. Our first insight is that a claim-checking pipeline can be transferred to a new domain of claims with access to a knowledge base from the new domain. Second, we do not find a “universally best” knowledge base—higher domain overlap of a task dataset and a knowledge base tends to produce better label accuracy. Third, combining multiple knowledge bases does not tend to improve performance beyond using the closest-domain knowledge base. Finally, we show that the claim-checking pipeline’s confidence score for selecting evidence can be used to assess whether a knowledge base will perform well for a new set of claims, even in the absence of ground-truth labels.
自动索赔检查中文本知识库的选择
自动索赔检查的任务是从可信事实的文本知识库中检索证据,确定索赔的真实性。先前的工作是将知识库作为给定的,并对索赔检查管道进行优化,而我们采取相反的方法——将管道作为给定的,我们探索知识库的选择。我们的第一个见解是,索赔检查管道可以被转移到一个新的索赔领域,并从新领域访问知识库。其次,我们没有找到一个“普遍最佳”的知识库-任务数据集和知识库的高域重叠往往会产生更好的标签准确性。第三,除了使用最接近领域的知识库之外,组合多个知识库并不倾向于提高性能。最后,我们证明了索赔检查管道选择证据的置信度得分可以用来评估知识库是否会在一组新的索赔中表现良好,即使在没有基本事实标签的情况下。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
ACM Journal of Data and Information Quality
ACM Journal of Data and Information Quality COMPUTER SCIENCE, INFORMATION SYSTEMS-
CiteScore
4.10
自引率
4.80%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信