{"title":"Discovering Future Malware Variants By Generating New Malware Samples Using Generative Adversarial Network","authors":"Zahra Moti, S. Hashemi, Amir Namavar","doi":"10.1109/ICCKE48569.2019.8964913","DOIUrl":null,"url":null,"abstract":"Detecting malware sample is one of the most important issues in computer security. Malware variants are growing exponentially by more usage of computer in industries, homes, and other places. Among different types of malware samples, zero-day samples are more challenging. The conventional antivirus systems, which rely on known malware patterns, cannot detect zero-day samples since did not see them before. As reported in [1], in 2018, 76% of successful attacks on organization endpoints were based on zero-day samples. Therefore, predicting these types of attacks and preparing a solution is an open challenge.This paper presents a deep generative adversarial network to generate the signature of unseen malware samples; The generated signature is potentially similar to the malware samples that may be released in the future. After generating the samples, these generated data were added to the dataset to train a robust classifier against new variants of malware. Also, neural network is applied for extracting high-level features from raw bytes for detection. In the proposed method, only the header of the executable file was used for detection, which is a small piece of the file that contains some information about the file. To validate our method, we used three classification algorithms and classified the raw and new representation using them. Also, we compared our work with another malware detection using the PE header. The results of this paper show that the generated data improves the accuracy of classification algorithms by at least 1%.","PeriodicalId":6685,"journal":{"name":"2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)","volume":"20 1","pages":"319-324"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCKE48569.2019.8964913","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
Detecting malware sample is one of the most important issues in computer security. Malware variants are growing exponentially by more usage of computer in industries, homes, and other places. Among different types of malware samples, zero-day samples are more challenging. The conventional antivirus systems, which rely on known malware patterns, cannot detect zero-day samples since did not see them before. As reported in [1], in 2018, 76% of successful attacks on organization endpoints were based on zero-day samples. Therefore, predicting these types of attacks and preparing a solution is an open challenge.This paper presents a deep generative adversarial network to generate the signature of unseen malware samples; The generated signature is potentially similar to the malware samples that may be released in the future. After generating the samples, these generated data were added to the dataset to train a robust classifier against new variants of malware. Also, neural network is applied for extracting high-level features from raw bytes for detection. In the proposed method, only the header of the executable file was used for detection, which is a small piece of the file that contains some information about the file. To validate our method, we used three classification algorithms and classified the raw and new representation using them. Also, we compared our work with another malware detection using the PE header. The results of this paper show that the generated data improves the accuracy of classification algorithms by at least 1%.