Mining Malware to Detect Variants

2014 Fifth Cybercrime and Trustworthy Computing Conference. Pub Date: 2014-11-24. DOI: 10.1109/CTC.2014.11

A. Azab, R. Layton, M. Alazab, Jonathan J. Oliver


Citations: 49

Abstract

Cybercrime continues to be a growing challenge, and malware, which has existed since the earliest days of the Internet, remains one of its most serious security threats. Cyber criminals continue to develop and advance their malicious attacks. Unfortunately, existing techniques for detecting malware and analysing code samples are insufficient and have significant limitations. For example, most malware detection studies have focused only on detection and neglected variants of the code. Investigating malware variants allows antivirus products and governments to detect these new attacks more easily, attribute them, predict such or similar attacks in the future, and carry out further analysis. The focus of this paper is measuring the similarity between different malware binaries of the same variant, using data mining concepts in conjunction with hashing algorithms. We investigate and evaluate the use of the Trend Locality Sensitive Hashing (TLSH) algorithm, together with the k-NN algorithm, to group binaries that belong to the same variant. Two Zeus variants, TSPY_ZBOT and MAL_ZBOT, were tested to assess the effectiveness of the proposed approach. We compare TLSH to related hashing methods (SSDEEP, SDHASH and NILSIMSA) that are currently used for this purpose. Experimental evaluation demonstrates that our method can effectively detect variants of malware and is resilient to common obfuscations used by cyber criminals. Our results show that TLSH and SDHASH provide the highest accuracy, scoring F-measures of 0.989 and 0.999 respectively.
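The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: a k-NN vote over similarity digests, followed by an F-measure computation. Everything in it is an assumption made for illustration. The digests are invented strings, `toy_distance` is a placeholder that merely counts differing characters (it is not TLSH, SSDEEP, SDHASH or NILSIMSA), and the helper names `knn_predict` and `f_measure` are hypothetical. In practice the placeholder would be replaced by the distance function of whichever similarity hash is being evaluated.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def toy_distance(a: str, b: str) -> int:
    """Placeholder distance (NOT a real similarity hash): number of
    differing positions, plus any difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def knn_predict(
    query: str,
    train: Sequence[Tuple[str, str]],
    distance: Callable[[str, str], int] = toy_distance,
    k: int = 3,
) -> str:
    """Label `query` by majority vote among its k nearest training digests."""
    neighbours = sorted(train, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def f_measure(y_true: List[str], y_pred: List[str], positive: str) -> float:
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented digests for illustration only; the labels reuse the two
# variant names mentioned in the abstract.
train = [
    ("AAAA1111", "TSPY_ZBOT"),
    ("AAAA1112", "TSPY_ZBOT"),
    ("AAAB1111", "TSPY_ZBOT"),
    ("FFFF9999", "MAL_ZBOT"),
    ("FFFF9998", "MAL_ZBOT"),
    ("FFFE9999", "MAL_ZBOT"),
]

queries = ["AAAA1113", "FFFF9997"]
truth = ["TSPY_ZBOT", "MAL_ZBOT"]
predicted = [knn_predict(q, train) for q in queries]
print(predicted, f_measure(truth, predicted, positive="TSPY_ZBOT"))
```

Passing the distance as a parameter keeps the classifier independent of the hashing scheme, which is the kind of setup a comparison across several hashing methods would need.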


