Dataset for Arabic Fake News
Rasha Assaf, M. Saheb
2021 IEEE 15th International Conference on Application of Information and Communication Technologies (AICT), 2021-10-13
DOI: 10.1109/AICT52784.2021.9620228 (https://doi.org/10.1109/AICT52784.2021.9620228)
Citations: 4
Abstract
The adoption of social media platforms allows misinformation to spread quickly, which can mislead the public. Widespread internet use enables users to create and share massive amounts of information, some of which is unreliable. Fake news has become an important social issue for researchers to tackle. A few English fake news datasets have been published, and numerous machine learning approaches have been proposed for news reliability classification. However, reliable Arabic datasets for fake news detection remain limited. This is a data paper in which we present a new dataset of Arabic fake news. The data was collected from various sources, including PalKashif. The articles and news segments were labeled by two experts. The dataset contains about 500 news segments, and the inter-annotator agreement, measured using Cohen’s Kappa, is 0.807. The dataset will be published for public use on GitHub.
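To illustrate how an inter-annotator agreement score of this kind can be computed, here is a minimal Python sketch of Cohen's Kappa for two annotators. The function name `cohen_kappa` and the toy labels are hypothetical; they are not taken from the paper or its dataset, and the authors' actual tooling is not described in the abstract.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: derived from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels for ten news segments (not from the dataset).
annotator_1 = ["fake", "fake", "real", "real", "fake",
               "real", "real", "fake", "real", "real"]
annotator_2 = ["fake", "fake", "real", "real", "real",
               "real", "real", "fake", "real", "real"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, so two annotators who agree on 9 of 10 items score lower than 0.9 once the label frequencies are taken into account.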