Control flow change in assembly as a classifier in malware analysis

2016 4th International Symposium on Digital Forensic and Security (ISDFS). Pub Date: 2016-04-27. DOI: 10.1109/ISDFS.2016.7473514

Andree Linke, Nhien-An Le-Khac


Citations: 9

Abstract

Classical signature-based malware detection methods fail to detect new malware and are not always effective against new obfuscation techniques. Moreover, new malware is easily created, and old malware can be recoded to produce new variants. Classical antivirus software therefore becomes steadily less effective against these new threats. Malware is also hand-tailored to bypass network security and antivirus products. Because analysts do not have enough time to dissect suspected malware by hand, automated approaches have been developed. To cope with the mass of new malware, statistical and machine learning methods have proved to be a good approach to classifying programs, especially when multiple approaches are combined to estimate the likelihood that software is malicious. Existing approaches mostly analyze the opcodes or mnemonics of the disassembly and their distribution. In this paper, we focus on the control flow change (CFC) itself and investigate whether it is a significant feature for detecting malware. Within the scope of this work, only relative control flow changes are considered, as these are easier to extract with the initially chosen disassembler library and lie within a range of 256 addresses. These features are analyzed as raw features and as n-grams of length 2, 4 and 6; the even more abstract feature of the occurrences of the n-grams is also used. Statistical methods as well as the Naïve Bayes algorithm were used to determine whether CFC carries significant information. We also test our approach with real-world datasets.
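To make the feature pipeline in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each program has already been reduced to a sequence of relative jump displacements, modelled here as signed 8-bit integers (one possible reading of "a range of 256 addresses"). The function names, the toy offset sequences, and the choice of a multinomial Naïve Bayes variant with add-one smoothing are all illustrative assumptions; the abstract specifies none of them.

```python
from collections import Counter
from math import log

def ngrams(offsets, n):
    """Return all contiguous n-grams (as tuples) of a sequence of jump offsets."""
    return [tuple(offsets[i:i + n]) for i in range(len(offsets) - n + 1)]

def ngram_counts(offsets, n):
    """Count how often each n-gram occurs in one program's offset sequence."""
    return Counter(ngrams(offsets, n))

def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model from (n-gram Counter, label) pairs."""
    class_docs = Counter()   # number of training programs per class
    class_feats = {}         # summed n-gram counts per class
    vocab = set()            # every n-gram seen during training
    for feats, label in samples:
        class_docs[label] += 1
        class_feats.setdefault(label, Counter()).update(feats)
        vocab.update(feats)
    return {"docs": class_docs, "feats": class_feats, "vocab": vocab}

def classify(model, feats):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, n_docs in model["docs"].items():
        counts = model["feats"][label]
        denom = sum(counts.values()) + vocab_size
        score = log(n_docs / total_docs)           # class prior
        for gram, c in feats.items():
            if gram in model["vocab"]:             # ignore unseen n-grams
                score += c * log((counts[gram] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: offsets are invented for illustration only.
benign = [[5, 12, -8, 5, 12, -8], [5, 12, 5, 12]]
malicious = [[-2, -2, 100, -2, -2, 100], [-2, 100, -2, 100]]

N = 2  # the abstract also mentions n-gram lengths 4 and 6
samples = [(ngram_counts(s, N), "benign") for s in benign]
samples += [(ngram_counts(s, N), "malicious") for s in malicious]
model = train_naive_bayes(samples)

print(classify(model, ngram_counts([5, 12, -8, 5], N)))
print(classify(model, ngram_counts([-2, 100, -2], N)))
```

In a real setting the offset sequences would come from a disassembler, and the n-gram length would be varied (2, 4, 6) to compare how much discriminative information each representation carries.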
