XAI-driven antivirus in pattern identification of Citadel malware

IF 3.1 · Region 3 (Computer Science) · JCR Q2 · COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Journal of Computational Science Pub Date : 2024-07-15 DOI:10.1016/j.jocs.2024.102389

Carlos Henrique Macedo dos Santos, Sidney Marlon Lopes de Lima

Journal of Computational Science, Volume 82, Article 102389. Full text: https://www.sciencedirect.com/science/article/pii/S1877750324001820

Citations: 0

Abstract



XAI-driven antivirus in pattern identification of Citadel malware

Background and Objective:

The steady growth of intrusions and information theft through infected software is a long-standing problem. According to McAfee Labs in 2020, an average of 480 new viruses are created each hour. The means of identifying and categorizing such threats and of creating vaccines may not keep that pace. Thanks to increasing processing power and the popularity of artificial intelligence, it is now possible to integrate intelligence into an antivirus engine to enhance its protective capabilities, and doing so with good algorithms and parameterization can be a key asset in securing one’s environment. In this work we analyze the overall performance of our antivirus and compare it with other state-of-the-art antiviruses.

Methods:

In this work, we create an extreme neural network that trains quickly and achieves satisfactory accuracy when classifying unknown files that may or may not be infected with Citadel. Our virus database is built from many examples of well-known infected files, and our results are compared with other intelligent antiviruses created by companies and/or researchers.
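To make the idea of an extreme neural network concrete, here is a minimal sketch of a generic extreme learning machine in NumPy. It is not the authors' mELM: the tanh activation, the hidden-layer size, the toy data, and the names `train_elm` and `predict_elm` are all assumptions chosen for illustration. The point it shows is why training can be fast: the hidden weights are random and fixed, so only the output weights are solved for, in closed form.

```python
import numpy as np

def train_elm(X, y, n_hidden=100, seed=0):
    """Fit a basic ELM: random fixed hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Hidden weights are drawn once and never trained (illustrative scaling).
    W = rng.normal(scale=1.0 / np.sqrt(n_features), size=(n_features, n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer activations
    beta = np.linalg.pinv(H) @ y      # output weights via the pseudoinverse
    return W, b, beta

def predict_elm(model, X):
    """Return 0/1 labels by thresholding the linear output at 0.5."""
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0.5).astype(int)

# Toy stand-in for benign (0) vs. infected (1) feature vectors; not real PE data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

model = train_elm(X, y)
train_acc = (predict_elm(model, X) == y).mean()
print(f"training accuracy on toy data: {train_acc:.2f}")
```

Because no iterative gradient descent is involved, the cost of training is dominated by a single pseudoinverse of the hidden-activation matrix.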

The proposed technique stands out in terms of efficiency and interpretability: its thorough pruning process yields a greatly reduced number of neurons. This dimensionality reduction shrinks the input layer by 98%, which not only makes the data easier to interpret but also reduces the time required for training.
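The abstract does not describe the pruning procedure itself, so the following is only a hedged sketch of what a 98% reduction of the input layer means in practice. It ranks input features by absolute correlation with the label and keeps the top 2%; the scoring rule, the function name `prune_features`, and the synthetic data are assumptions, not the authors' method.

```python
import numpy as np

def prune_features(X, y, keep_fraction=0.02):
    """Return sorted indices of the input features most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Absolute Pearson correlation per column; the epsilon guards constant columns.
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    score = np.abs(Xc.T @ yc) / denom
    n_keep = max(1, int(round(keep_fraction * X.shape[1])))
    return np.sort(np.argsort(score)[-n_keep:])

# Synthetic data: 500 input features, only columns 7 and 42 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 500))
y = (X[:, 7] + X[:, 42] > 0).astype(float)

kept = prune_features(X, y)
X_small = X[:, kept]
reduction = 1 - X_small.shape[1] / X.shape[1]
print(f"kept {X_small.shape[1]} of {X.shape[1]} inputs ({reduction:.0%} reduction)")
```

A smaller input layer helps in two ways: there are fewer inputs for a reader to interpret, and every downstream matrix product operates on a much narrower matrix.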

Results:

Our antivirus achieves an overall performance of 98.50% when distinguishing harmless from malicious portable executable (PE) programs. To enhance accuracy, we conducted tests under various initial conditions, learning functions, and architectures. Our successful results require only 0.19 s of training when using the complete training database, and the response time is so short that the computer rounds it to 0.00 s.
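The remark about a response time that rounds to 0.00 s is easy to reproduce: any duration below 5 ms prints as 0.00 when formatted with two decimal places. The sketch below assumes a hypothetical `classify` function standing in for a single prediction; it is not the authors' classifier, and the measured time will vary by machine.

```python
import time

def format_seconds(t):
    """Format a duration with two decimals, as in a typical timing report."""
    return f"{t:.2f} s"

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def classify(features):
    # Hypothetical stand-in for one forward pass of a trained model.
    return int(sum(features) > 0)

label, elapsed = timed(classify, [0.2, -0.1, 0.4])
print("two decimals:", format_seconds(elapsed))
print(f"microseconds: {elapsed * 1e6:.1f} us")
```

Reporting such times in milliseconds or microseconds, or averaging over many predictions, gives a figure that can actually be compared between classifiers.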

Conclusions:

In this work, we conclude that mELM implementations are viable and that their performance can match state-of-the-art ones. Their training and classification times are among the fastest of the algorithms tested, and their accuracy in detecting Citadel-infected PEs is acceptable.


Source journal

Journal of Computational Science
Categories: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS

CiteScore: 5.50
Self-citation rate: 3.00%
Articles published: 227
Review time: 41 days

About the journal: Computational Science is a rapidly growing multi- and interdisciplinary field that uses advanced computing and data analysis to understand and solve complex problems. It has reached a level of predictive capability that now firmly complements the traditional pillars of experimentation and theory. Recent advances in experimental techniques, such as detectors, on-line sensor networks and high-resolution imaging techniques, have opened up new windows into physical and biological processes at many levels of detail. The resulting data explosion allows for detailed data-driven modeling and simulation. This new discipline in science combines computational thinking, modern computational methods, devices and collateral technologies to address problems far beyond the scope of traditional numerical methods. Computational science typically unifies three distinct elements:
• Modeling, Algorithms and Simulations (e.g., numerical and non-numerical, discrete and continuous);
• Software developed to solve science (e.g., biological, physical, and social), engineering, medicine, and humanities problems;
• Computer and information science that develops and optimizes the advanced system hardware, software, networking, and data management components (e.g., problem-solving environments).