Le Duc Thuan, V. H. Pham, H. Hiep, Nguyen Kim Khanh
Improvement of feature set based on Apriori algorithm in Android malware classification using machine learning method
2020 RIVF International Conference on Computing and Communication Technologies (RIVF), published 2020-10-01
DOI: 10.1109/RIVF48685.2020.9140779 (https://doi.org/10.1109/RIVF48685.2020.9140779)
Citations: 2
Abstract
A well-constructed feature set plays an important role in improving the accuracy of malware detection. However, research on and evaluation of the relations between features, with the aim of acquiring a good feature set, have not received much attention. In this work, a method based on the Apriori algorithm is proposed to improve the feature set. The method mines association rules from the initial feature set to derive highly correlated and informative features, which are added to the initial set. The improved feature set is evaluated via cross-validation using various machine learning algorithms, such as SVM, Random Forest, and CNN. The accuracy reached in the test is 96.49%, with 96.71% improved compared with the test using the initial set.
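
To make the idea of mining associations between features and adding the correlated combinations back to the feature set more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names (`frequent_pairs`, `augment`), the thresholds, the restriction to feature pairs, and the permission-style feature names are all assumptions chosen for illustration. It follows the Apriori principle in a reduced form, where candidate pairs are built only from features that are frequent on their own.

```python
from itertools import combinations


def frequent_pairs(samples, min_support=0.5, min_confidence=0.8):
    """Return sorted feature pairs (a, b) that are frequent and strongly associated.

    `samples` is a list of sets, one set of feature names per app (hypothetical data).
    """
    n = len(samples)
    # Level 1: count single features and keep only the frequent ones.
    counts = {}
    for sample in samples:
        for feature in sample:
            counts[feature] = counts.get(feature, 0) + 1
    frequent = {f for f, c in counts.items() if c / n >= min_support}

    # Level 2: candidate pairs use frequent single features only (Apriori pruning).
    pair_counts = {}
    for sample in samples:
        for pair in combinations(sorted(sample & frequent), 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1

    selected = []
    for (a, b), c in sorted(pair_counts.items()):
        if c / n < min_support:
            continue
        # confidence(a -> b) = count(a, b) / count(a); keep the pair if either
        # direction of the rule is confident enough.
        if max(c / counts[a], c / counts[b]) >= min_confidence:
            selected.append((a, b))
    return selected


def augment(sample, pairs):
    """Add one combined feature 'a&b' for every selected pair present in the sample."""
    combined = {f"{a}&{b}" for a, b in pairs if a in sample and b in sample}
    return set(sample) | combined


# Hypothetical toy data: each set lists the features observed in one app.
samples = [
    {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
    {"SEND_SMS", "READ_CONTACTS"},
    {"INTERNET"},
    {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
]
pairs = frequent_pairs(samples)
improved = [augment(s, pairs) for s in samples]
```

In this toy data only the `READ_CONTACTS` and `SEND_SMS` pair passes both thresholds, so the apps that contain both gain one combined feature. The augmented sets could then be encoded as binary vectors and passed to a classifier under cross-validation, as the abstract describes.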