Exploring the Effectiveness and Efficiency of LightGBM Algorithm for Windows Malware Detection

2022 5th Information Technology for Education and Development (ITED) Pub Date : 2022-11-01 DOI:10.1109/ITED56637.2022.10051488

M. Onoja, Abayomi Jegede, Jesse Mazadu, G. Aimufua, Ayodele Oyedele, Kolawole Olibodum


Citations: 2

Abstract

Malware poses a serious problem in today's world of cyber security. Effective malware detection approaches minimize the damage caused by malware attacks, while efficient detection strategies reduce the resources required to detect malware. A previous application of the LightGBM model to malware detection showed that the technique is suitable for Windows malware detection. However, that study did not compute the training time, detection time and classification accuracy of the model. There is a need to evaluate the accuracy of the LightGBM algorithm and to determine the time required to train it, because quality training produces a highly reliable model. It is also necessary to compute the classification accuracy and prediction time to support better decision making. This paper applies the generic LightGBM algorithm to Windows malware to determine its efficiency and effectiveness in terms of training time, prediction time and classification accuracy. Performance evaluation based on the Malimg dataset shows a training accuracy of 99.80% for binary classification and 96.87% for multi-class classification. The training time of the generic LightGBM is 179.51 s for binary classification and 2224.77 s for multi-class classification. For binary classification, the model achieved a True Positive Rate (TPR) of 99% and a False Positive Rate (FPR) of 0.99%, while the prediction times are 0.08 s and 0.40 s for binary and multi-class classification, respectively. The results obtained for training time, detection time and classification accuracy show that the LightGBM algorithm is suitable for detecting Windows malware.
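The abstract reports three kinds of measurement: classification accuracy, TPR and FPR for the binary case, and wall-clock training and prediction time. The sketch below shows one common way such figures could be computed. It is not the authors' code: the helper names `binary_rates` and `timed`, the label convention (1 = malware, 0 = benign), and the toy labels are all assumptions made for illustration, and no LightGBM model or Malimg data is involved.

```python
import time
from typing import Any, Callable, Sequence, Tuple


def binary_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, TPR, FPR), assuming 1 = malware and 0 = benign."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware flagged as malware
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed as benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as malware
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed

    accuracy = (tp + tn) / len(pairs)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # TPR = TP / (TP + FN)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # FPR = FP / (FP + TN)
    return accuracy, tpr, fpr


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Toy labels, purely illustrative; not taken from the Malimg dataset.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    (acc, tpr, fpr), seconds = timed(binary_rates, y_true, y_pred)
    print(f"accuracy={acc:.2%} TPR={tpr:.2%} FPR={fpr:.2%} elapsed={seconds:.6f}s")
```

In a real experiment, `timed` would wrap the model's training call and its prediction call separately, so that training time and prediction time are reported as distinct figures, as they are in the abstract.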

