Ikram Ben Abdel Ouahab, Lotfi Elaachak, M. Bouhorma
IAES International Journal of Artificial Intelligence. Published 2023-12-01. DOI: https://doi.org/10.11591/ijai.v12.i4.pp1836-1844
Improve malware classifiers performance using cost-sensitive learning for imbalanced dataset
In recent times, malware visualization has become very popular for malware classification in cybersecurity. Existing malware features can easily identify known malware that has already been detected, but they cannot accurately identify new and infrequent malware. Moreover, deep learning algorithms have shown their power in malware classification. However, we found that the data used are imbalanced: the Malimg database, which contains 25 malware families, does not have the same or a similar number of images per class. To address these issues, this paper proposes an effective malware classifier based on cost-sensitive deep learning. When performing classification on imbalanced data, some classes get lower accuracy than others. Cost-sensitive learning is meant to solve this issue; however, in our case of 25 classes, classical cost-sensitive weights were not effective in giving equal attention to all classes. The proposed approach improves the performance of malware classification, and we demonstrate this improvement using two convolutional neural network (CNN) models, built with the functional and subclassing programming techniques, and evaluated on loss, accuracy, recall, and precision.
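As an illustration of the classical cost-sensitive weighting the abstract refers to, the following minimal sketch computes inverse-frequency class weights and applies them to a per-sample cross-entropy loss. It is not the weighting scheme proposed in the paper, which the abstract does not specify. The formula `n_samples / (n_classes * count)` is the common "balanced" heuristic, and the function names, the toy labels, and their counts are hypothetical and do not come from Malimg.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


def weighted_cross_entropy(probs, true_label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class.

    `probs` maps each class to the predicted probability for this sample.
    """
    return -weights[true_label] * math.log(probs[true_label])


# Hypothetical, deliberately imbalanced toy labels (not Malimg class counts).
labels = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
weights = balanced_class_weights(labels)

# The same predicted probability costs more when the true class is rare.
prediction = {"A": 0.5, "B": 0.3, "C": 0.2}
loss_if_true_a = weighted_cross_entropy(prediction, "A", weights)
loss_if_true_c = weighted_cross_entropy(prediction, "C", weights)

for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

With these weights, every class contributes the same total weight to the loss regardless of how many samples it has. In Keras, a dictionary of this kind keyed by integer class index can be passed to `Model.fit` through its `class_weight` argument.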