Romanian Fake News Identification using Language Models

Andrei Preda, Stefan Ruseti, Simina Terian, M. Dascalu
Published in: Romanian Conference on Human-Computer Interaction
DOI: 10.37789/rochi.2022.1.1.13 (https://doi.org/10.37789/rochi.2022.1.1.13)

Abstract

In an increasingly complex socio-economic and political context, the amount of fake news distributed online is on the rise and has already influenced major events and our decision-making capabilities. Studies show that people tend to be overconfident in their ability to identify fake news, which suggests that an automatic detection system might be helpful. This article describes state-of-the-art techniques used in text classification and analyzes the performance of different neural networks on a corpus of news articles written in Romanian. Classical machine learning methods are considered, as well as more complex Transformer-based models, which achieved better results, with RoBERT and a CNN on top reaching a weighted F1-score of 0.75. Experiments with multi-task learning are also described; they reached an F1-score of 0.74 and thus did not provide a boost in performance. We also introduce a prototype web application and additional use cases for automated fake news detection systems.
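
The weighted F1-score reported in the abstract is conventionally the per-class F1 averaged with weights proportional to each class's share of the true labels. The sketch below illustrates that standard definition in plain Python; it is not the authors' evaluation code, and the function name `weighted_f1`, the label names, and the toy predictions are all hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


# Toy labels invented for illustration only (not data from the paper).
y_true = ["real", "real", "real", "fake", "fake", "satire"]
y_pred = ["real", "real", "fake", "fake", "fake", "real"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because the weights follow class frequency, a model that does well on the majority class can score highly even if a rare class is missed entirely, which is worth keeping in mind when comparing values such as 0.75 and 0.74.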