Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News

Luís Borges, Bruno Martins, P. Calado
Journal of Data and Information Quality (JDIQ), vol. 11, no. 1, pp. 1-26. Published 2018-11-02. Journal article. DOI: 10.1145/3287763 (https://doi.org/10.1145/3287763)
Citations: 55

Abstract

Fake news is nowadays an issue of pressing concern, given its recent rise as a potential threat to high-quality journalism and well-informed public discourse. The Fake News Challenge (FNC-1) was organized in early 2017 to encourage the development of machine-learning-based classification systems for stance detection (i.e., for identifying whether a particular news article agrees with, disagrees with, discusses, or is unrelated to a particular news headline), thus helping in the detection and analysis of possible instances of fake news. This article presents a novel approach to this stance detection problem, based on the combination of string similarity features with a deep neural network architecture that leverages ideas previously advanced in the context of learning efficient text representations, document classification, and natural language inference. Specifically, we use bi-directional Recurrent Neural Networks (RNNs), together with max-pooling over the temporal/sequential dimension and neural attention, to represent (i) the headline, (ii) the first two sentences of the news article, and (iii) the entire news article. These representations are then combined/compared, complemented with similarity features inspired by other FNC-1 approaches, and passed to a final layer that predicts the stance of the article toward the headline. We also explore the use of external sources of information, specifically large datasets of sentence pairs originally proposed for training and evaluating natural language inference methods, to pre-train specific components of the neural network architecture (e.g., the RNNs used for encoding sentences). The obtained results attest to the effectiveness of the proposed ideas and show that our model, particularly when considering pre-training and the combination of neural representations together with similarity features, slightly outperforms the previous state of the art.
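The pipeline in the abstract (pool per-token RNN states into one vector, combine/compare the headline and article vectors, append similarity features) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names are hypothetical, the matching scheme (concatenation, absolute difference, element-wise product) is an assumption borrowed from common natural language inference practice, and Jaccard token overlap stands in for whichever string similarity features the article actually uses. The RNN encoder and the attention mechanism are omitted; the hidden states are taken as given.

```python
from typing import List, Sequence

Vector = List[float]


def max_pool_over_time(states: Sequence[Sequence[float]]) -> Vector:
    """Element-wise maximum over the sequence (time) dimension.

    `states` holds one hidden-state vector per token (for example, the
    outputs of a bi-directional RNN); the result is one fixed-size vector.
    """
    if not states:
        raise ValueError("need at least one time step")
    return [max(column) for column in zip(*states)]


def combine(u: Sequence[float], v: Sequence[float]) -> Vector:
    """Concatenate u, v, |u - v| and u * v (an assumed matching scheme)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + diff + prod


def jaccard_similarity(headline: str, body: str) -> float:
    """Token-overlap (Jaccard) similarity, used here as a stand-in feature."""
    a = set(headline.lower().split())
    b = set(body.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_classifier_input(
    headline_states: Sequence[Sequence[float]],
    body_states: Sequence[Sequence[float]],
    headline: str,
    body: str,
) -> Vector:
    """Pool both sequences, compare them, and append one similarity feature."""
    u = max_pool_over_time(headline_states)
    v = max_pool_over_time(body_states)
    return combine(u, v) + [jaccard_similarity(headline, body)]
```

In a full model, the vector returned by `build_classifier_input` would feed a final layer that scores the four stance classes (agrees, disagrees, discusses, unrelated).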