Kien Hoang Dang, Dai Tho Nguyen, Thu Hien Nguyen Thi
2021 International Conference on Advanced Technologies for Communications (ATC). Published 2021-10-14. DOI: 10.1109/atc52653.2021.9598272 (https://doi.org/10.1109/atc52653.2021.9598272)
Efficient Incremental Instance-based Learning Algorithms for Open World Malware Classification
Malware is growing rapidly in number and becoming increasingly sophisticated. To counter it, we need to collect samples continuously and use them to update the classifier. In this paper, we propose a method for adding newly labeled malware samples to the classifier without retraining it from scratch. The classifier can be updated with labeled samples from either an existing class or a new class. Our method can also detect samples from unknown families. Experiments are performed on a traditional computer malware dataset and an IoT malware dataset. The results show that our method achieves a macro F1-score almost the same as that of full retraining while taking significantly less time.
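The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain k-nearest-neighbour classifier over numeric feature vectors with Euclidean distance, and a distance threshold for flagging unknown families. The class name `IncrementalKNN`, the parameters `k` and `reject_distance`, and the family labels are all hypothetical.

```python
import math
from collections import Counter


class IncrementalKNN:
    """Toy instance-based classifier that is updated by storing new samples.

    Illustrative assumption only: the paper's actual algorithm is not
    described in the abstract.
    """

    def __init__(self, k=3, reject_distance=1.0):
        self.k = k                              # neighbours that vote on the label
        self.reject_distance = reject_distance  # hypothetical open-world threshold
        self.samples = []                       # stored (feature_vector, label) pairs

    def update(self, features, label):
        # Incremental update: append the instance; nothing is retrained.
        # The label may belong to an existing class or to a brand-new one.
        self.samples.append((tuple(features), label))

    def predict(self, features):
        if not self.samples:
            return "unknown"
        # Keep the k stored instances closest to the query (Euclidean distance).
        nearest = sorted(
            (math.dist(features, stored), label) for stored, label in self.samples
        )[: self.k]
        # Open-world check: if even the closest instance is too far away,
        # treat the query as a sample of an unknown family.
        if nearest[0][0] > self.reject_distance:
            return "unknown"
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


clf = IncrementalKNN(k=1, reject_distance=1.0)
clf.update([0.0, 0.0], "family_a")
clf.update([5.0, 5.0], "family_b")
print(clf.predict([0.1, 0.0]))     # close to a stored family_a sample
print(clf.predict([10.0, 10.0]))   # far from everything stored so far
clf.update([10.0, 10.0], "family_c")  # add a new class without retraining
print(clf.predict([10.0, 10.2]))
```

In this sketch an update costs one list append, which is where an instance-based scheme saves time over retraining. The price is that prediction scans every stored sample, so a practical system would likely need an index or a cap on the number of stored instances.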