Smartphone Malware Detection using Permissions and McNemar test
Authors: G. Kumari, Anshul Arora
Venue: 2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS)
Publication date: 2023-06-14
DOI: 10.1109/ICSCSS57650.2023.10169391 (https://doi.org/10.1109/ICSCSS57650.2023.10169391)
Citation count: 0
Abstract
A recent report shows that smartphone availability is rising at an alarming rate, and the volume of mobile malware is growing exponentially as smartphones become more popular. Given the threat that malicious applications pose to Android users, fast and effective malware detection is essential. One approach is to analyze the permissions an application requests. An effective permission-based detector needs a large dataset and many distinct permissions to reveal usage patterns, but computation time rises sharply with the number of permissions analyzed. That time can be cut by shrinking either the dataset or the permission set, and reducing the number of features is preferable to reducing the amount of data. The permission set can be reduced only by keeping the most distinguishing permissions and discarding those that contribute little to separating malware from benign applications. This calls for a novel method that ranks each permission by how well it indicates the nature of an application. This study uses a statistical technique, the McNemar test, to measure the correlation of each permission with malware and benign applications and to rank the permissions accordingly. The correlation is a numerical value for how much a permission's use overlaps between malware and benign applications: the greater the value, the less useful the permission is for distinguishing the two. The ranking makes it possible to eliminate irrelevant permissions and can then be used for detection with various machine-learning algorithms. With this approach, the study narrows the permission set from 129 to 38 and achieves 97% detection accuracy with a Random Forest classifier.
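The abstract does not say how the McNemar test is set up, so the following is only a rough sketch of one possible reading, not the authors' procedure. It assumes equal-sized malware and benign sets whose samples are paired by index, and it scores each permission with the standard McNemar statistic (b - c)^2 / (b + c), where b and c are the discordant pair counts. Note that a larger statistic here means a stronger imbalance between the two classes, which runs in the opposite direction to the overlap-style "correlation" value described above. The function names, the pairing scheme, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def mcnemar_statistic(b: int, c: int) -> float:
    """Standard McNemar chi-square statistic, without continuity correction.

    b and c are the two discordant counts of a paired 2x2 table.
    Returns 0.0 when there are no discordant pairs.
    """
    if b + c == 0:
        return 0.0
    return (b - c) ** 2 / (b + c)


def rank_permissions(
    malware: Sequence[Sequence[int]],
    benign: Sequence[Sequence[int]],
    names: Sequence[str],
) -> List[Tuple[str, float]]:
    """Rank permissions by McNemar statistic, largest first.

    ASSUMPTION (not from the paper): malware[i] and benign[i] form a pair,
    and each row is a 0/1 vector with one entry per permission in `names`.
    """
    if len(malware) != len(benign):
        raise ValueError("this sketch needs equally sized, paired sample sets")
    scores = []
    for j, name in enumerate(names):
        # b: permission requested by the malware sample only.
        b = sum(1 for m, g in zip(malware, benign) if m[j] == 1 and g[j] == 0)
        # c: permission requested by the benign sample only.
        c = sum(1 for m, g in zip(malware, benign) if m[j] == 0 and g[j] == 1)
        scores.append((name, mcnemar_statistic(b, c)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four malware/benign pairs, three permissions.
names = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
malware = [[1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 1, 0]]
benign = [[0, 1, 0], [0, 1, 1], [0, 1, 0], [0, 1, 0]]

ranked = rank_permissions(malware, benign, names)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy data INTERNET is requested by every sample in both classes, so it has no discordant pairs and lands at the bottom of the ranking; a permission like that would be a natural candidate for elimination before training a classifier on the reduced set.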