Data Augmentations to Improve BERT-based Detection of Covid-19 Fake News on Twitter
Authors: Feby Dahlan, S. Suyanto
Published in: 2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE)
Publication date: 2023-02-16
DOI: 10.1109/ICCoSITE57641.2023.10127796
URL: https://doi.org/10.1109/ICCoSITE57641.2023.10127796
Citations: 1
Abstract
Since Covid-19 spread across the world, news about it has been shared to reduce the impact of the outbreak. Social media, particularly Twitter, is a reliable source of information exchange. However, irresponsible users also spread Covid-19 fake news to the public, which is harmful to all parties. Hence, a fake news detector is required to tackle the problem. In this research, a Transformer-based fake news detection system is created. First, an architecture is designed using Bidirectional Encoder Representations from Transformers (BERT). Three augmentation methods, namely spell-checking-based, acronym-based, and typography-based augmentation, are then developed to improve the BERT model. The system is evaluated with 5-fold cross-validation on eleven thousand Twitter posts using four metrics: Accuracy, Precision, Recall, and F1-Score. Experimental results indicate that all three proposed augmentation methods increase BERT's performance in detecting fake news related to Covid-19. The acronym-based augmentation gives a small improvement, the spell-checking-based one a moderate improvement, and the typography-based one the most significant improvement.
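The evaluation protocol named in the abstract (5-fold cross-validation scored with Accuracy, Precision, Recall, and F1-Score) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `k_fold_indices` and `binary_metrics` are hypothetical, the folds are plain contiguous slices (a real experiment would likely shuffle or stratify first), and label 1 is assumed to mean "fake".

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs over range(n_samples).

    Assumption: contiguous, unshuffled folds. The first (n_samples % k)
    folds receive one extra sample so every sample is tested exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute the four metrics for binary labels (assumption: 1 = fake, 0 = real)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, predicted fake
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real, predicted real
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real, predicted fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, predicted real

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

In use, a classifier would be trained on each `train` index list and scored on the matching `test` list with `binary_metrics`, and the five per-fold results would then be averaged. Whether augmented copies of a post are kept out of the test folds is a design choice the abstract does not specify.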