Romanian Fake News Identification using Language Models
Andrei Preda, Stefan Ruseti, Simina Terian, M. Dascalu
Romanian Conference on Human-Computer Interaction
DOI: 10.37789/rochi.2022.1.1.13 (https://doi.org/10.37789/rochi.2022.1.1.13)
Citation count: 2
Abstract
In an increasingly complex socio-economic and political context, the amount of fake news distributed online is on the rise and has already influenced major events and our decision-making capabilities. Studies show that people tend to be overconfident in their ability to identify fake news, which suggests that an automatic detection system might be helpful. This article describes state-of-the-art techniques used in text classification and analyzes the performance of different neural networks on a corpus of news articles written in Romanian. Classical machine learning methods are considered alongside more complex Transformer-based models, which achieved better results, with RoBERT and a CNN on top reaching a weighted F1-score of .75. Experiments with multi-task learning are also described; they reached an F1-score of .74 and therefore did not improve performance. We also introduce a prototype web application and additional use cases for automated fake news detection systems.
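The abstract reports results as a weighted F1-score but does not spell out how it is computed. The sketch below assumes the usual definition: the per-class F1 scores averaged with weights proportional to each class's number of true examples. The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the paper's corpus or code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed definition)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true examples of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class frequency
    return score


# Made-up labels for illustration only; not the paper's classes or data.
y_true = ["fake", "fake", "real", "real", "real", "satire"]
y_pred = ["fake", "real", "real", "real", "fake", "satire"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because each class is weighted by its frequency, a model that does well on the majority class can score high even when minority classes are poorly recognized, which is worth keeping in mind when comparing scores such as .75 and .74.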