Fast and Efficient Malware Detection with Joint Static and Dynamic Features Through Transfer Learning

International Conference on Applied Cryptography and Network Security. Pub Date: 2022-11-25. DOI: 10.48550/arXiv.2211.13860

Mao V. Ngo, Tram Truong-Huu, Dima Rabadi, Jia Yi Loo, S. Teo

Citations: 0

Abstract

In malware detection, dynamic analysis extracts the runtime behavior of malware samples in a controlled environment, while static analysis extracts features using reverse engineering tools. The former is challenged by anti-virtualization and the evasive behavior of malware samples; the latter is challenged by code obfuscation. To address these drawbacks, prior work has built detection models that aggregate dynamic and static features, thus leveraging the advantages of both approaches. However, simply concatenating dynamic and static features leads to imbalanced contributions to the performance of malware detection models, because the feature vectors have heterogeneous dimensions. Moreover, dynamic analysis is time-consuming and requires a secure environment, leading to detection delays and high costs for maintaining the analysis infrastructure. In this paper, we first introduce a method of constructing aggregated features by concatenating latent features, learned through deep learning, whose dimensions contribute equally. We then develop a knowledge distillation technique that transfers the knowledge learned from aggregated features by a teacher model to a student model trained only on static features, and we use the trained student model to detect new malware samples. We carry out extensive experiments on a dataset of 86,709 samples, including both benign and malware samples. The experimental results show that the teacher model trained on aggregated features constructed by our method outperforms state-of-the-art models, with an improvement of up to 2.38% in detection accuracy. The distilled student model not only achieves high performance (97.81% accuracy), comparable to that of the teacher model, but also significantly reduces the detection time (from 70046.6 ms to 194.9 ms) without requiring dynamic analysis.
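The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture or the loss, so the code below assumes a generic temperature-scaled distillation loss (soft-target KL term plus hard-label cross-entropy). The function names `aggregate_latents` and `distillation_loss` and the parameters `temperature` and `alpha` are hypothetical and chosen for illustration only.

```python
import math


def aggregate_latents(static_latent, dynamic_latent):
    """Concatenate two latent vectors of equal length.

    Assumption: each modality has already been encoded to the same latent
    size, so neither side dominates the aggregated vector by sheer dimension.
    """
    if len(static_latent) != len(dynamic_latent):
        raise ValueError("latent vectors must have the same dimension")
    return list(static_latent) + list(dynamic_latent)


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss (an assumed form, not taken from the paper).

    alpha * T^2 * KL(teacher || student) on softened outputs
    + (1 - alpha) * cross-entropy of the student against the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical values: the teacher sees static + dynamic latents,
# while the student sees static features only and mimics the teacher's outputs.
teacher_input = aggregate_latents([0.1, 0.2], [0.3, 0.4])
loss_close = distillation_loss([2.0, 0.0], [2.0, 0.0], label=0)
loss_far = distillation_loss([0.0, 2.0], [2.0, 0.0], label=0)
```

In this sketch a student whose logits match the teacher's incurs only the cross-entropy term, so `loss_close` is smaller than `loss_far`. At detection time only the student would be evaluated, which is why no dynamic analysis is needed for new samples.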

