Malware classification using byte sequence information

Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems
Pub Date: 2018-10-09
DOI: 10.1145/3264746.3264775

Byungho Jung, Taeguen Kim, E. Im

Citations: 14

Abstract

The number of new malware samples and variants has been increasing continuously. Security experts analyze malware to capture its malicious properties and to generate signatures or detection rules, but the analysis overhead keeps growing with the number of samples. Analyzing malware at this scale requires automatic analysis methods. Recently, deep learning techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been applied to malware classification. The features used in previous approaches are mostly based on API (Application Programming Interface) information, and API invocation information can be obtained through dynamic analysis. However, invocation information may not reflect the malicious behavior of malware, because malware developers use various analysis-avoidance techniques. Therefore, deep learning-based malware analysis using other features still needs to be developed to improve analysis performance. In this paper, we propose a malware classification method that uses a deep learning algorithm based on byte information. The proposed method uses images generated from malware byte information, which can reflect the behavioral context of malware, and processes the generated images with convolutional neural network-based sentence analysis. We performed several experiments to show the effectiveness of the proposed method; the results show that it achieved higher accuracy than the naive CNN model, with a detection accuracy of about 99%.
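The abstract does not say how the images are built from byte information. As a rough illustration only, one common way to turn a byte sequence into a grayscale image is to treat each byte as a pixel intensity (0 to 255) and wrap the sequence into fixed-width rows. The sketch below assumes that scheme; the function name `bytes_to_image`, the row width, and the zero padding are hypothetical choices and are not taken from the paper.

```python
from typing import List


def bytes_to_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Wrap a byte sequence into rows of `width` grayscale pixels (0-255).

    Assumption: one byte maps to one pixel; this is not the paper's
    documented procedure.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad the tail with zero bytes so the last row is complete.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Illustrative bytes only, not real malware data.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03])
image = bytes_to_image(sample, width=4)
print(image)  # [[77, 90, 144, 0], [3, 0, 0, 0]]
```

A 2D array like this could then be fed to a CNN as a single-channel image; the network architecture itself is outside the scope of this sketch.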
