Detecting Arabic YouTube Spam Using Data Mining Techniques
Yahya M. Tashtoush, Areen Magableh, Omar Darwish, Lujain Smadi, Omar Alomari, Anood ALghazoo
2022 10th International Symposium on Digital Forensics and Security (ISDFS), published 2022-06-06
DOI: 10.1109/ISDFS55398.2022.9800840 (https://doi.org/10.1109/ISDFS55398.2022.9800840)
Citations: 3
Abstract
Since YouTube became a source of income, the number of users has increased significantly, and so has the number of spammers who aim to spread viruses or to promote their own videos and channels. This behavior has led many YouTubers to close their channels or disable comments, because YouTube does not provide enough tools to prevent it. Filtering Arabic spam comments is especially challenging because of the many Arabic dialects, which contain a huge number of synonyms. In this work, we classify these comments using different algorithms such as Decision Tree (DT), Support Vector Machine (SVM), Naive Bayes (NB), Random Forest, and k-Nearest Neighbor (k-NN).
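To make the classification step concrete, here is a minimal sketch of one of the listed algorithms, Naive Bayes, applied to comment text. It is not the authors' implementation: the abstract does not describe their features, preprocessing, or dataset. The class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the Laplace smoothing parameter `alpha`, and the four toy comments are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace splitting. Real Arabic comments would
    # likely need normalization (diacritics, letter variants, dialect forms).
    return text.split()


class NaiveBayesSpamFilter:
    """Hypothetical multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter()            # label -> number of comments
        self.vocab = set()

    def fit(self, comments, labels):
        for text, label in zip(comments, labels):
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # tokens never seen in training are ignored
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, made-up training data (not taken from the paper's dataset).
train_comments = [
    "اشترك في قناتي الآن",    # "subscribe to my channel now"
    "زوروا قناتي واشتركوا",   # "visit my channel and subscribe"
    "فيديو رائع شكرا لك",     # "great video, thank you"
    "شرح ممتاز شكرا",         # "excellent explanation, thanks"
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesSpamFilter().fit(train_comments, train_labels)
print(model.predict("اشترك في قناتي"))
print(model.predict("شكرا فيديو رائع"))
```

The sketch also shows why dialects and synonyms make the task hard: with whitespace tokens, "اشترك" and "واشتركوا" are unrelated features even though both mean "subscribe", so a real system would need normalization or richer features before any of the listed classifiers is applied.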