Classification of Short Messages Initiated by Mobile Malware
Marián Kühnel, Ulrike Meyer
2016 11th International Conference on Availability, Reliability and Security (ARES), 2016-08-01
DOI: 10.1109/ARES.2016.53 (https://doi.org/10.1109/ARES.2016.53)
In this paper we show that supervised machine learning algorithms can reliably detect short messages initiated by mobile malware based on features derived from the content of the messages. In particular, we compare the detection capabilities of the classifiers Support Vector Machines, K-Nearest Neighbor, Decision Trees, Random Forests, and Multinomial Naive Bayes in three different evaluation scenarios. The first scenario is standard k-fold cross-validation, which treats all short messages as independent of each other. In the second scenario, we evaluate how the classifiers perform if only a certain portion of the malware families is known during training. Here, we show that training with only 50% of the malware families already leads to an accuracy of over 90%. Finally, in the third scenario we evaluate the performance chronologically, i.e., the classifiers are trained with the short messages available at a certain point in time and tested on newly arriving messages. Here, we show that the classifiers can detect the majority of new short messages initiated by mobile malware even months after training.
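The three evaluation scenarios differ only in how messages are divided into training and test sets, so they can be illustrated with a small sketch. The code below is not the authors' implementation: the `Message` record, the `BENIGN` label, the function names, and in particular the handling of benign messages in the second scenario (split by the same fraction as the families) are assumptions made for illustration. The abstract does not specify any of these details.

```python
import random
from typing import Iterator, List, NamedTuple, Sequence, Tuple

BENIGN = "benign"  # hypothetical label for messages not sent by malware


class Message(NamedTuple):
    text: str
    family: str     # malware family name, or BENIGN (assumed field)
    timestamp: int  # e.g. days since data collection began (assumed field)


Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def k_fold_splits(n: int, k: int, seed: int = 0) -> Iterator[Split]:
    """Scenario 1: k-fold cross-validation, messages treated as independent."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def family_split(messages: Sequence[Message], known_fraction: float,
                 seed: int = 0) -> Split:
    """Scenario 2: only a fraction of the malware families is seen in training.

    Assumption: benign messages are divided by the same fraction.
    """
    rng = random.Random(seed)
    families = sorted({m.family for m in messages if m.family != BENIGN})
    rng.shuffle(families)
    known = set(families[: int(len(families) * known_fraction)])

    benign = [i for i, m in enumerate(messages) if m.family == BENIGN]
    rng.shuffle(benign)
    n_train = int(len(benign) * known_fraction)
    train, test = benign[:n_train], benign[n_train:]

    for i, m in enumerate(messages):
        if m.family != BENIGN:
            (train if m.family in known else test).append(i)
    return train, test


def chronological_split(messages: Sequence[Message], cutoff: int) -> Split:
    """Scenario 3: train on messages up to the cutoff, test on later ones."""
    train = [i for i, m in enumerate(messages) if m.timestamp <= cutoff]
    test = [i for i, m in enumerate(messages) if m.timestamp > cutoff]
    return train, test
```

Each function returns index lists, so any of the compared classifiers could be trained on the first list and scored on the second. With `family_split(messages, 0.5)`, for example, no malware family in the test set has been seen during training, which is the condition under which the abstract reports over 90% accuracy.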