Detecting Indonesian Spammer on Twitter
E. B. Setiawan, D. H. Widyantoro, K. Surendro
2018 6th International Conference on Information and Communication Technology (ICoICT), published 2018-05-01
DOI: 10.1109/ICOICT.2018.8528773 (https://doi.org/10.1109/ICOICT.2018.8528773)
Citations: 5
Abstract
Twitter is one of the most popular social media platforms today. However, it has several problems that negatively affect its users, one of which is spam. Our approach differs from previous research in three respects: it targets Indonesian-language Twitter, it crawls user and tweet data automatically, and it adds new features. We use two feature dimensions, namely user-based and tweet-based features. In this paper, we detect Indonesian spammers on Twitter using four classification algorithms: Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression (Logit), and J48. The results show better accuracy than that of existing approaches. The highest accuracy, 93.67%, is achieved using Logistic Regression (Logit).
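To make the classification step concrete, the following is a minimal sketch of logistic regression over a mix of user-based and tweet-based features. It is not the authors' implementation: the abstract does not list the features or the toolkit used, so the three features (a follower/following ratio, URLs per tweet, hashtags per tweet), the toy data values, and all function names are hypothetical, and plain batch gradient descent stands in for whatever solver the paper relied on.

```python
import math

# Hypothetical, hand-made samples: ([follower/following ratio (user-based),
# URLs per tweet (tweet-based), hashtags per tweet (tweet-based)], label).
# Label 1 = spammer, 0 = legitimate user. Values are illustrative only.
TRAIN = [
    ([0.05, 0.90, 0.80], 1),
    ([0.10, 0.80, 0.70], 1),
    ([0.02, 0.95, 0.60], 1),
    ([0.15, 0.70, 0.90], 1),
    ([0.90, 0.10, 0.20], 0),
    ([0.80, 0.20, 0.10], 0),
    ([0.95, 0.05, 0.30], 0),
    ([0.70, 0.15, 0.25], 0),
]


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Linear score w.x + b for one feature vector."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def train_logit(samples, epochs=2000, lr=0.5):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in samples:
            err = sigmoid(score(weights, bias, x)) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def predict(model, x):
    """Return 1 (spammer) when the predicted probability is at least 0.5."""
    weights, bias = model
    return 1 if sigmoid(score(weights, bias, x)) >= 0.5 else 0


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for x, y in samples if predict(model, x) == y)
    return correct / len(samples)


model = train_logit(TRAIN)
train_accuracy = accuracy(model, TRAIN)
```

Calling `predict(model, [0.03, 0.85, 0.75])` classifies a new account described by the same three hypothetical features. The toy data is deliberately easy to separate, so the accuracy it yields says nothing about the 93.67% reported in the paper, which was measured on the authors' own crawled data.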