Romanian Fake News Identification using Language Models

Romanian Conference on Human-Computer Interaction | Pub Date: 1900-01-01 | DOI: 10.37789/rochi.2022.1.1.13

Andrei Preda, Stefan Ruseti, Simina Terian, M. Dascalu

Citations: 2

Abstract

In an increasingly complex socio-economic and political context, the amount of fake news distributed online is on the rise and has already influenced major events and our decision-making capabilities. Studies show that people tend to be overconfident in their ability to identify fake news, which suggests that an automatic system for detecting it might be helpful. This article describes state-of-the-art techniques used in text classification and analyzes the performance of different neural networks on a corpus of news articles written in Romanian. Classical machine learning methods are considered, as well as more complex models based on Transformers. The Transformer-based models achieved better results, with RoBERT and a CNN on top reaching a weighted F1-score of 0.75. Experiments with multi-task learning are also described; they reached an F1-score of 0.74 and did not improve performance. We also introduce a prototype web application and additional use cases for automated fake news detection systems.
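Since the abstract reports results as a weighted F1-score, a minimal sketch of how that metric is commonly computed may help in reading the numbers: each class's F1 is averaged with a weight equal to the number of true examples of that class. The function name `weighted_f1` and the labels used in the example are assumptions made for illustration; they are not taken from the paper, whose label set and evaluation code are not described here.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")

    support = Counter(y_true)    # true examples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += n * f1  # weight each class by its support
    return total / len(y_true)


# Hypothetical labels, chosen only for this example.
y_true = ["fake", "fake", "real", "real", "real", "satire"]
y_pred = ["fake", "real", "real", "real", "fake", "satire"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class contributes in proportion to its support, frequent classes dominate the score, which is worth keeping in mind when a news corpus is imbalanced.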


