An Empirical Evaluation of Automated Machine Learning Techniques for Malware Detection

Proceedings of the 2021 ACM Workshop on Security and Privacy Analytics. Pub Date: 2021-04-28. DOI: 10.1145/3445970.3451155

P. P. Kundu, Lux Anatharaman, Tram Truong-Huu


Citations: 8

Abstract

Machine learning techniques now develop so quickly that even an expert finds it increasingly difficult to incorporate all of the recent best practices into their modeling. For applications that handle big data sets, choosing the best performing model with the best hyper-parameter setting becomes harder still. In this work, we present an empirical evaluation of automated machine learning (AutoML) frameworks, that is, techniques that optimize the hyper-parameters of machine learning models to reach the best achievable performance. We apply AutoML techniques to the malware detection problem, which requires a true positive rate as high as possible together with a false positive rate as low as possible. We adopt two AutoML frameworks, AutoGluon-Tabular and Microsoft Neural Network Intelligence (NNI), to optimize the hyper-parameters of a Light Gradient Boosted Machine (LightGBM) model for classifying malware samples. We carry out extensive experiments on two data sets. The first is the publicly available EMBER data set, which has been used as a benchmark in many malware detection works. The second is a private data set of recently collected malware samples that we acquired from a security company. We provide an empirical analysis and a performance comparison of the two AutoML frameworks. The experimental results show that the AutoML frameworks can identify hyper-parameter sets that significantly outperform the known best performing hyper-parameter setting: the true positive rate of the LightGBM classifier improves from 86.8% to 90% at a 0.1% false positive rate on the EMBER data set, and from 80.8% to 87.4% on the private data set.
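The results above are reported as a true positive rate at a fixed false positive rate (0.1%). The following is a minimal sketch of how such an operating point can be computed from classifier scores. It is not the authors' evaluation code: the function name `tpr_at_fpr`, the label convention (1 = malware, 0 = benign), and the assumption that higher scores mean "more likely malware" are all illustrative choices.

```python
from typing import Sequence


def tpr_at_fpr(labels: Sequence[int], scores: Sequence[float],
               max_fpr: float = 0.001) -> float:
    """Return the highest true positive rate whose false positive rate
    does not exceed max_fpr.

    Assumed conventions (not taken from the paper): labels are 1 for
    malware and 0 for benign, and a sample is flagged as malware when
    its score is greater than or equal to the chosen threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malware and one benign sample")

    # Sweep candidate thresholds from the highest score downward.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Absorb every sample tied at this score, so that the counts
        # correspond to a threshold that can actually be applied.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # lowering the threshold further only adds false positives
        best_tpr = tp / n_pos
    return best_tpr
```

Calling `tpr_at_fpr(labels, scores, max_fpr=0.001)` on held-out predictions gives the kind of figure quoted in the abstract. Note that at a 0.1% budget the benign set needs at least 1,000 samples before even one false positive is allowed, so the metric is only meaningful on reasonably large test sets.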

