Sadia Nazim, Muhammad Mansoor Alam, Syed Safdar Rizvi, Jawahir Che Mustapha, Syed Shujaa Hussain, Mazliham Mohd Suud
{"title":"使用可解释的深度学习推进恶意软件图像分类:使用SHAP, LIME和Grad-CAM的最先进方法。","authors":"Sadia Nazim, Muhammad Mansoor Alam, Syed Safdar Rizvi, Jawahir Che Mustapha, Syed Shujaa Hussain, Mazliham Mohd Suud","doi":"10.1371/journal.pone.0318542","DOIUrl":null,"url":null,"abstract":"<p><p>Artificial Intelligence (AI) is being integrated into increasingly more domains of everyday activities. Whereas AI has countless benefits, its convoluted and sometimes vague internal operations can establish difficulties. Nowadays, AI is significantly employed for evaluations in cybersecurity that find it challenging to justify their proceedings; this absence of accountability is alarming. Additionally, over the last ten years, the fractional elevation in malware variants has directed scholars to utilize Machine Learning (ML) and Deep Learning (DL) approaches for detection. Although these methods yield exceptional accuracy, they are also difficult to understand. Thus, the advancement of interpretable and powerful AI models is indispensable to their reliability and trustworthiness. The trust of users in the models used for cybersecurity would be undermined by the ambiguous and indefinable nature of existing AI-based methods, specifically in light of the more complicated and diverse nature of cyberattacks in modern times. The present research addresses the comparative analysis of an ensemble deep neural network (DNNW) with different ensemble techniques like RUSBoost, Random Forest, Subspace, AdaBoost, and BagTree for the best prediction against imagery malware data. It determines the best-performing model, an ensemble DNNW, for which explainability is provided. There has been relatively little study on explainability, especially when dealing with malware imagery data, irrespective of the fact that DL/ML algorithms have revolutionized malware detection. Explainability techniques such as SHAP, LIME, and Grad-CAM approaches are employed to present a complete comprehension of feature significance and local or global predictive behavior of the model over various malware categories. A comprehensive investigation of significant characteristics and their impact on the decision-making process of the model and multiple query point visualizations are some of the contributions. This strategy promotes advanced transparency and trustworthy cybersecurity applications by improving the comprehension of malware detection techniques and integrating explainable AI observations with domain-specific knowledge.</p>","PeriodicalId":20189,"journal":{"name":"PLoS ONE","volume":"20 5","pages":"e0318542"},"PeriodicalIF":2.6000,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12118971/pdf/","citationCount":"0","resultStr":"{\"title\":\"Advancing malware imagery classification with explainable deep learning: A state-of-the-art approach using SHAP, LIME and Grad-CAM.\",\"authors\":\"Sadia Nazim, Muhammad Mansoor Alam, Syed Safdar Rizvi, Jawahir Che Mustapha, Syed Shujaa Hussain, Mazliham Mohd Suud\",\"doi\":\"10.1371/journal.pone.0318542\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Artificial Intelligence (AI) is being integrated into increasingly more domains of everyday activities. Whereas AI has countless benefits, its convoluted and sometimes vague internal operations can establish difficulties. 
Nowadays, AI is significantly employed for evaluations in cybersecurity that find it challenging to justify their proceedings; this absence of accountability is alarming. Additionally, over the last ten years, the fractional elevation in malware variants has directed scholars to utilize Machine Learning (ML) and Deep Learning (DL) approaches for detection. Although these methods yield exceptional accuracy, they are also difficult to understand. Thus, the advancement of interpretable and powerful AI models is indispensable to their reliability and trustworthiness. The trust of users in the models used for cybersecurity would be undermined by the ambiguous and indefinable nature of existing AI-based methods, specifically in light of the more complicated and diverse nature of cyberattacks in modern times. The present research addresses the comparative analysis of an ensemble deep neural network (DNNW) with different ensemble techniques like RUSBoost, Random Forest, Subspace, AdaBoost, and BagTree for the best prediction against imagery malware data. It determines the best-performing model, an ensemble DNNW, for which explainability is provided. There has been relatively little study on explainability, especially when dealing with malware imagery data, irrespective of the fact that DL/ML algorithms have revolutionized malware detection. Explainability techniques such as SHAP, LIME, and Grad-CAM approaches are employed to present a complete comprehension of feature significance and local or global predictive behavior of the model over various malware categories. A comprehensive investigation of significant characteristics and their impact on the decision-making process of the model and multiple query point visualizations are some of the contributions. This strategy promotes advanced transparency and trustworthy cybersecurity applications by improving the comprehension of malware detection techniques and integrating explainable AI observations with domain-specific knowledge.</p>\",\"PeriodicalId\":20189,\"journal\":{\"name\":\"PLoS ONE\",\"volume\":\"20 5\",\"pages\":\"e0318542\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2025-05-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12118971/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PLoS ONE\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1371/journal.pone.0318542\",\"RegionNum\":3,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS ONE","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1371/journal.pone.0318542","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Advancing malware imagery classification with explainable deep learning: A state-of-the-art approach using SHAP, LIME and Grad-CAM.
Artificial Intelligence (AI) is being integrated into an increasing number of domains of everyday activity. While AI offers countless benefits, its convoluted and sometimes opaque internal operations can create difficulties. AI is now widely employed for cybersecurity assessments, yet these systems struggle to justify their decisions; this absence of accountability is alarming. Additionally, over the last ten years, the rise in malware variants has led researchers to apply Machine Learning (ML) and Deep Learning (DL) approaches to detection. Although these methods achieve exceptional accuracy, they are also difficult to understand. Thus, the development of interpretable yet powerful AI models is indispensable to their reliability and trustworthiness. User trust in cybersecurity models is undermined by the ambiguous, opaque nature of existing AI-based methods, especially given the increasingly complicated and diverse nature of modern cyberattacks. The present research compares an ensemble deep neural network (DNNW) with ensemble techniques such as RUSBoost, Random Forest, Subspace, AdaBoost, and BagTree to determine the best predictor for malware imagery data. It identifies the best-performing model, the ensemble DNNW, and provides explainability for it. Despite the fact that DL/ML algorithms have revolutionized malware detection, there has been relatively little study of explainability, especially for malware imagery data. Explainability techniques such as SHAP, LIME, and Grad-CAM are employed to give a complete picture of feature importance and the local and global predictive behavior of the model across various malware categories. Contributions include a comprehensive investigation of significant features and their impact on the model's decision-making process, together with visualizations for multiple query points. This strategy promotes greater transparency and more trustworthy cybersecurity applications by improving the comprehension of malware detection techniques and integrating explainable AI observations with domain-specific knowledge.
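The paper's own code and the exact ensemble DNNW architecture are not reproduced here. As an illustration of the kind of explainability step the abstract describes, the sketch below applies Grad-CAM to a small, hypothetical Keras CNN over 64x64 grayscale malware byteplot images; the model definition, image size, layer names, and class count are assumptions made for the example, not details taken from the paper.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def build_model(num_classes: int = 25) -> keras.Model:
    """Hypothetical small CNN for 64x64 grayscale malware byteplots (not the paper's DNNW)."""
    inputs = keras.Input(shape=(64, 64, 1))
    x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = keras.layers.MaxPooling2D()(x)
    x = keras.layers.Conv2D(64, 3, padding="same", activation="relu", name="last_conv")(x)
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)

def grad_cam(model: keras.Model, image: np.ndarray, conv_layer: str = "last_conv") -> np.ndarray:
    """Return a [0, 1] heat map over the last conv feature map for the predicted class."""
    grad_model = keras.Model(model.inputs, [model.get_layer(conv_layer).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        class_idx = int(tf.argmax(preds[0]))          # explain the top-scoring class
        class_score = preds[:, class_idx]
    grads = tape.gradient(class_score, conv_out)      # d(score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))      # global-average-pool the gradients
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights[0], axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

if __name__ == "__main__":
    model = build_model()                                    # untrained; for illustration only
    sample = np.random.rand(64, 64, 1).astype("float32")     # stand-in for a malware byteplot
    heatmap = grad_cam(model, sample)
    print(heatmap.shape)  # (32, 32); upsample and overlay on the image for inspection
```

SHAP and LIME can be layered onto the same image classifier in a similar fashion (for example via shap.GradientExplainer or lime.lime_image.LimeImageExplainer), but the specific explainer settings and query-point visualizations reported in the paper are not reproduced in this sketch.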
Journal Introduction:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage