DroidScribe: Classifying Android Malware Based on Runtime Behavior

2016 IEEE Security and Privacy Workshops (SPW) | Pub Date: 2016-05-22 | DOI: 10.1109/SPW.2016.25

Santanu Kumar Dash, Guillermo Suarez-Tangil, Salahuddin J. Khan, K. Tam, Mansour Ahmadi, Johannes Kinder, L. Cavallaro

Citations: 162

Abstract

The Android ecosystem has witnessed a surge in malware, which not only puts mobile devices at risk but also increases the burden on malware analysts assessing and categorizing threats. In this paper, we show how to use machine learning to automatically classify Android malware samples into families with high accuracy, while observing only their runtime behavior. We focus exclusively on dynamic analysis of runtime behavior to provide a clean point of comparison that is dual to static approaches. Specific challenges in the use of dynamic analysis on Android are the limited information gained from tracking low-level events and the imperfect coverage when testing apps, e.g., due to inactive command and control servers. We observe that on Android, pure system calls do not carry enough semantic content for classification and instead rely on lightweight virtual machine introspection to also reconstruct Android-level inter-process communication. To address the sparsity of data resulting from low coverage, we introduce a novel classification method that fuses Support Vector Machines with Conformal Prediction to generate high-accuracy prediction sets where the information is insufficient to pinpoint a single family.
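The idea of returning a prediction set instead of a single family can be illustrated with a minimal sketch of label-conditional conformal prediction. This is not the authors' implementation: the abstract does not state which nonconformity measure the paper uses, so the margin-based measure below is an assumption, as are all function names and the toy numbers. The per-family scores stand in for SVM decision values (higher means more similar to that family), and at least two families are assumed.

```python
from typing import Dict, List

Scores = Dict[str, float]  # family name -> classifier score (assumed SVM decision value)


def nonconformity(scores: Scores, label: str) -> float:
    """Assumed measure: best rival score minus the score of `label` (higher = stranger)."""
    rival = max(s for fam, s in scores.items() if fam != label)
    return rival - scores[label]


def calibrate(cal_scores: List[Scores], cal_labels: List[str]) -> Dict[str, List[float]]:
    """Collect nonconformity values of held-out samples, grouped by true family."""
    table: Dict[str, List[float]] = {}
    for scores, label in zip(cal_scores, cal_labels):
        table.setdefault(label, []).append(nonconformity(scores, label))
    return table


def p_values(table: Dict[str, List[float]], scores: Scores) -> Dict[str, float]:
    """p-value per family: share of calibration samples at least as strange as the test sample."""
    result: Dict[str, float] = {}
    for fam in scores:
        alpha = nonconformity(scores, fam)
        reference = table.get(fam, [])
        at_least = sum(1 for a in reference if a >= alpha)
        result[fam] = (at_least + 1) / (len(reference) + 1)
    return result


def prediction_set(table: Dict[str, List[float]], scores: Scores, epsilon: float) -> List[str]:
    """Keep every family whose p-value exceeds the significance level epsilon."""
    return sorted(fam for fam, p in p_values(table, scores).items() if p > epsilon)


# Toy calibration data (invented for illustration): two hypothetical families, A and B.
cal_scores = [
    {"A": 2.0, "B": -1.0}, {"A": 1.0, "B": 0.0}, {"A": 0.5, "B": 0.25}, {"A": -0.25, "B": 0.25},
    {"A": -1.5, "B": 1.5}, {"A": -0.5, "B": 0.5}, {"A": 0.25, "B": 0.5}, {"A": 0.5, "B": 0.0},
]
cal_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
table = calibrate(cal_scores, cal_labels)

clear = {"A": 2.0, "B": -0.5}       # strong evidence for one family
ambiguous = {"A": 0.25, "B": 0.0}   # sparse evidence, e.g. low coverage during testing

print(prediction_set(table, clear, epsilon=0.25))      # ['A']
print(prediction_set(table, ambiguous, epsilon=0.25))  # ['A', 'B']
```

When the scores separate the families clearly, the set collapses to a single family; when the observed behavior is too sparse to decide, the set grows rather than forcing a possibly wrong single label. A lower `epsilon` gives larger, more cautious sets.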
