Dynamic Android Malware Classification Using Graph-Based Representations

2016 IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud) Pub Date : 2016-06-25 DOI:10.1109/CSCloud.2016.27

Lifan Xu, D. Zhang, Marco A. Alvarez, J. Morales, Xudong Ma, John Cavazos


Citations: 24

Abstract

Malware classification for the Android ecosystem can be performed using a range of techniques. One major technique that has been gaining ground recently is dynamic analysis based on system call invocations recorded during the execution of Android applications. Dynamic analysis has traditionally been based on converting system calls into flat feature vectors and feeding the vectors into machine learning algorithms for classification. In this paper, we implement three traditional feature-vector-based representations for Android system calls. For each feature vector representation, we also propose a novel graph-based representation. We then use graph kernels to compute pairwise similarities and feed these similarity measures into a Support Vector Machine (SVM) for classification. To speed up the graph kernel computation, we compress the graphs using the Compressed Row Storage format, and then we apply OpenMP to parallelize the computation. Experiments show that the graph-based representations improve classification accuracy over the corresponding feature-vector-based representations built from the same input. Finally, we show that different representations can be combined to further improve classification accuracy.
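The abstract does not spell out the three representations or the graph kernels used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a system call histogram as the flat feature vector, a call-to-call transition graph as the graph-based counterpart, and a simple shared-edge product as a stand-in similarity; none of these are confirmed to be the authors' choices. The function names, the toy vocabulary, and the traces are hypothetical. The Compressed Row Storage layout (values, column indices, row pointers) is the standard one.

```python
from collections import Counter

# Hypothetical toy vocabulary and traces, for illustration only.
VOCABULARY = ["open", "read", "write", "close"]
TRACE_A = ["open", "read", "read", "write", "close"]
TRACE_B = ["open", "read", "write", "write", "close"]


def histogram_vector(trace, vocabulary):
    """Flat feature vector: how often each system call occurs in the trace."""
    counts = Counter(trace)
    return [counts[name] for name in vocabulary]


def transition_graph(trace, vocabulary):
    """Dense adjacency matrix: entry [i][j] counts call i directly followed by call j."""
    index = {name: i for i, name in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0] * size for _ in range(size)]
    for src, dst in zip(trace, trace[1:]):
        matrix[index[src]][index[dst]] += 1
    return matrix


def to_csr(matrix):
    """Compressed Row Storage: keep only the nonzero entries, row by row."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, weight in enumerate(row):
            if weight != 0:
                values.append(weight)
                col_idx.append(j)
        row_ptr.append(len(values))  # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_edges(csr):
    """Expand a CSR triple back into a {(row, col): weight} dictionary."""
    values, col_idx, row_ptr = csr
    edges = {}
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            edges[(i, col_idx[k])] = values[k]
    return edges


def shared_edge_similarity(csr_a, csr_b):
    """Stand-in similarity (an assumption, not the paper's kernel):
    sum of weight products over edges present in both graphs."""
    edges_a, edges_b = csr_edges(csr_a), csr_edges(csr_b)
    return sum(w * edges_b[e] for e, w in edges_a.items() if e in edges_b)


if __name__ == "__main__":
    graph_a = to_csr(transition_graph(TRACE_A, VOCABULARY))
    graph_b = to_csr(transition_graph(TRACE_B, VOCABULARY))
    print("histogram A:", histogram_vector(TRACE_A, VOCABULARY))
    print("CSR graph A:", graph_a)
    print("similarity(A, B):", shared_edge_similarity(graph_a, graph_b))
```

The two toy traces differ in both representations, but only the graph records which call follows which. In a full pipeline, the pairwise similarities for all applications would form a Gram matrix that an SVM accepting precomputed kernels could consume, and because each pair is computed independently, the pairwise loop is a natural target for the kind of parallelization the abstract describes.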

