Preprocessing of radicalism dataset to predict radical content in Indonesia
M. Subhan, Amang Sudarsono, A. Barakbah
2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), 2017-09-01
DOI: 10.1109/KCIC.2017.8228598 (https://doi.org/10.1109/KCIC.2017.8228598)
Citations: 3
Abstract
In procedural terms, radical content is content that invites, provokes, or performs certain acts, or that interprets jihad in a limited sense, such as suicide bombing. In Indonesia, radical content is often associated with issues of tribe, religion, and race. Classifying radical content is a challenging technical problem because the content is large in volume, unstructured, and noisy. The more content there is, the more features it produces, which leads to high dimensionality and can degrade the performance of the classification algorithm. One way to address this is dimensionality reduction, for example through feature selection. In this study, we propose an approach for selecting features categorized as radical and non-radical using Human Brain and DF-Threshold. Preprocessing is performed first, followed by text mining and then feature selection using Human Brain and DF-Threshold. Testing is done through 10-fold cross-validation with k-Nearest Neighbor (k-NN) as the classifier. In these trials, the highest accuracy obtained was 66.37%, with k = 7.
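The pipeline described in the abstract (document-frequency thresholding, then k-NN evaluated by k-fold cross-validation) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The tokenizer, the binary bag-of-words representation, the Hamming distance, the interleaved fold assignment, and all function names are assumptions made for illustration. The "Human Brain" selection step is not modeled. For brevity, the sketch assumes features are selected once on the whole corpus before cross-validation.

```python
from collections import Counter

def tokenize(text):
    # Simplified stand-in for preprocessing: lowercase and split on whitespace.
    return text.lower().split()

def document_frequency(docs):
    # Count in how many documents each term appears (at most once per document).
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df

def select_by_df(docs, min_df):
    # DF threshold: keep terms that occur in at least `min_df` documents.
    df = document_frequency(docs)
    return sorted(term for term, n in df.items() if n >= min_df)

def vectorize(tokens, vocab):
    # Binary bag-of-words vector over the selected vocabulary.
    present = set(tokens)
    return [1 if term in present else 0 for term in vocab]

def knn_predict(train_x, train_y, x, k):
    # Hamming distance on binary vectors, majority vote among the k nearest.
    dists = sorted(
        (sum(a != b for a, b in zip(v, x)), i) for i, v in enumerate(train_x)
    )
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]

def cross_val_accuracy(xs, ys, k, folds):
    # Interleaved k-fold split: sample i belongs to fold i % folds.
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
                correct += 1
    return correct / len(xs)

# Hypothetical toy corpus with placeholder tokens (not data from the paper).
toy_docs = [tokenize(t) for t in ["a b c", "a b", "a d", "e b"]]
toy_vocab = select_by_df(toy_docs, min_df=2)
toy_vectors = [vectorize(d, toy_vocab) for d in toy_docs]
```

With a labeled corpus, `cross_val_accuracy(vectors, labels, k=7, folds=10)` would mirror the evaluation setting reported in the abstract. The accuracy obtained would depend entirely on the data and on the preprocessing choices assumed here.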