Tahsinur Rahman, Nusaiba Ahmed, Shama Monjur, Fasbeer Mohammad Haque, Muhammad Iqbal Hossain
2023 2nd International Conference for Innovation in Technology (INOCON), published 2023-03-03
DOI: 10.1109/INOCON57975.2023.10101116 (https://doi.org/10.1109/INOCON57975.2023.10101116)
Interpreting Machine and Deep Learning Models for PDF Malware Detection using XAI and SHAP Framework
As the world moves toward a digital era, exchanging data in Portable Document Format (PDF) has become ubiquitous. Unfortunately, the format is susceptible to malware attacks, and conventional anti-malware and anti-virus software may fail to detect PDF malware effectively. Machine learning algorithms and neural networks have previously been proposed to address this problem. However, the lack of transparency in these models raises concerns about whether their decisions are ethical and responsible. To address this concern, this work classifies PDF files as either malicious or clean and uses Explainable AI (XAI) with the SHAP framework to provide both a global and a local understanding of the models' decisions. The algorithms employed are Stochastic Gradient Descent (SGD), XGBoost Classifier, Single Layer Perceptron, and Artificial Neural Network (ANN).
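The difference between a local and a global explanation can be illustrated with a small sketch. This is not the authors' code and it does not use the SHAP library. It computes exact Shapley values by enumerating feature subsets, which is only feasible for a handful of features. The feature names, the toy `predict` function standing in for a trained classifier, and all data rows are hypothetical and made up for illustration.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical feature names; the paper's actual feature set is not given here.
FEATURES = ["js_objects", "open_actions", "pages"]


def predict(z):
    """Toy stand-in for a trained classifier: returns a score in (0, 1)."""
    score = 1.5 * z[0] + 2.0 * z[1] - 0.3 * z[2] - 1.0
    return 1.0 / (1.0 + exp(-score))


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x, by subset enumeration."""
    n = len(x)

    def value(subset):
        # Average model output when features in `subset` are fixed to x
        # and the remaining features are taken from each background row.
        total = 0.0
        for row in background:
            z = [x[j] if j in subset else row[j] for j in range(n)]
            total += predict(z)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Made-up toy rows, not real PDF data.
background = [[0, 0, 3], [1, 0, 5], [0, 1, 1], [2, 1, 2]]
samples = [[3, 1, 1], [0, 0, 10]]

# Local explanation: one attribution vector per sample.
local = [shapley_values(predict, x, background) for x in samples]

# Global explanation: mean absolute attribution per feature across samples.
global_importance = [
    sum(abs(phi[j]) for phi in local) / len(local)
    for j in range(len(FEATURES))
]

# Average prediction over the background; attributions are relative to this.
base_value = sum(predict(row) for row in background) / len(background)

for x, phi in zip(samples, local):
    print(x, dict(zip(FEATURES, (round(p, 3) for p in phi))))
print("global:", dict(zip(FEATURES, (round(g, 3) for g in global_importance))))
```

For each sample, the attributions sum to the difference between the model's output for that sample and the average output over the background rows. That property is what makes the per-feature numbers readable as contributions to a single decision. Subset enumeration grows exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms.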