CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media

Momchil Hardalov, Anton Chernyavskiy, Ivan Koychev, Dmitry I. Ilvovsky, Preslav Nakov
{"title":"CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media","authors":"Momchil Hardalov, Anton Chernyavskiy, Ivan Koychev, Dmitry I. Ilvovsky, Preslav Nakov","doi":"10.48550/arXiv.2210.04447","DOIUrl":null,"url":null,"abstract":"While there has been substantial progress in developing systems to automate fact-checking, they still lack credibility in the eyes of the users. Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return back an article that explains their decision. This is a sensible approach as people trust manual fact-checking, and as many claims are repeated multiple times. Yet, a major issue when building such systems is the small number of known tweet–verifying article pairs available for training. Here, we aim to bridge this gap by making use of crowd fact-checking, i.e., mining claims in social media for which users have responded with a link to a fact-checking article. In particular, we mine a large-scale collection of 330,000 tweets paired with a corresponding fact-checking article. We further propose an end-to-end framework to learn from this noisy data based on modified self-adaptive training, in a distant supervision scenario. Our experiments on the CLEF’21 CheckThat! test set show improvements over the state of the art by two points absolute. Our code and datasets are available at https://github.com/mhardalov/crowdchecked-claims","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"49 1","pages":"266-285"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AACL Bioflux","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2210.04447","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Environmental Science","Score":null,"Total":0}
引用次数: 4

Abstract

While there has been substantial progress in developing systems to automate fact-checking, they still lack credibility in the eyes of the users. Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return an article that explains their decision. This is a sensible approach as people trust manual fact-checking, and as many claims are repeated multiple times. Yet, a major issue when building such systems is the small number of known tweet–verifying article pairs available for training. Here, we aim to bridge this gap by making use of crowd fact-checking, i.e., mining claims in social media for which users have responded with a link to a fact-checking article. In particular, we mine a large-scale collection of 330,000 tweets paired with a corresponding fact-checking article. We further propose an end-to-end framework to learn from this noisy data based on modified self-adaptive training, in a distant supervision scenario. Our experiments on the CLEF'21 CheckThat! test set show improvements over the state of the art by two points absolute. Our code and datasets are available at https://github.com/mhardalov/crowdchecked-claims.
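The crowd fact-checking signal described in the abstract (tweets whose replies contain a link to a fact-checking article) can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration of that pairing step, not the authors' actual collection pipeline: the fact-checking domain list, the input schema, and the function names are assumptions made for illustration.

```python
import re

# Illustrative list of fact-checking domains; the actual set of sources used
# to build CrowdChecked is not specified here.
FACT_CHECK_DOMAINS = ("snopes.com", "politifact.com", "factcheck.org")

URL_PATTERN = re.compile(r"https?://\S+")


def extract_fact_check_links(text):
    """Return all URLs in a reply that point to a known fact-checking domain."""
    return [url for url in URL_PATTERN.findall(text)
            if any(domain in url for domain in FACT_CHECK_DOMAINS)]


def mine_pairs(replies):
    """Pair each claim tweet with the fact-checking article cited in a reply.

    `replies` is an iterable of dicts holding the replying tweet's text and
    the text of the tweet it responds to (a simplified, hypothetical schema).
    """
    pairs = []
    for reply in replies:
        for url in extract_fact_check_links(reply["reply_text"]):
            pairs.append({"claim_tweet": reply["parent_text"],
                          "fact_check_url": url})
    return pairs


if __name__ == "__main__":
    # Toy example with made-up tweet text and a made-up reply link.
    sample = [{"parent_text": "Eating garlic cures COVID-19!",
               "reply_text": "This was debunked: https://www.snopes.com/fact-check/garlic-covid/"}]
    print(mine_pairs(sample))
```

Pairs mined this way are only distantly supervised: a reply that links to a fact-checking article does not guarantee that the article actually verifies the parent tweet's claim, which is why the paper treats the collected data as noisy and learns from it with a modified self-adaptive training procedure.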