Shuangcheng Li, Zhangguo Tang, Huanzhou Li, Jian Zhang, Han Wang, Junfeng Wang
Journal of Information Security and Applications, vol. 84, Article 103800 (published 2024-06-07). DOI: 10.1016/j.jisa.2024.103800. https://www.sciencedirect.com/science/article/pii/S2214212624001030
GMADV: An android malware variant generation and classification adversarial training framework
Android malware uses anti-reverse-engineering and APK packing (shelling) techniques, which cause classification methods based on decompiled features to fail and reduce the accuracy of methods based on features from a single file. Moreover, some Android malware families have too few samples, which makes classification models that learn from samples ineffective for those families. To address these problems, this paper proposes GMADV, a general two-layer framework for Android malware classification and adversarial training that improves classifier performance through adversarial training. In the sample classification layer, building on the Markov-model transformation method, the three files in the APK are converted into RGB Markov images for the first time, and VGG13 is used to extract features and classify samples automatically. In the variant amplification layer, the idea of "regression for generation" is proposed for the first time, and a Gaussian-process-based GMM-GAN is designed to increase the diversity of samples within each family. The experimental results show that RGB Markov images achieve better classification performance than grayscale images. On the three datasets, classification performance improves to varying degrees after amplification, and the F1 score reaches 95% on all of them. Compared with other methods, GMADV shows stronger family sample amplification ability and greater adversarial intensity.
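To make the RGB Markov image idea concrete, here is a minimal sketch of one common way to build such an image: each file's bytes are treated as a first-order Markov chain, the 256 x 256 byte-transition probability matrix becomes one color channel, and three files give the three channels. The abstract does not say which three APK files are used, how probabilities are scaled to pixel values, or what image size is fed to VGG13, so the function names, the `round(p * 255)` scaling, and the generic `file_r`, `file_g`, `file_b` arguments are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Pixel = Tuple[int, int, int]


def markov_matrix(data: bytes) -> Matrix:
    """Row-normalised 256x256 matrix of byte-to-next-byte transition probabilities."""
    counts = [[0] * 256 for _ in range(256)]
    # Count how often byte value `current` is immediately followed by `following`.
    for current, following in zip(data, data[1:]):
        counts[current][following] += 1
    matrix: Matrix = []
    for row in counts:
        total = sum(row)
        # Rows for byte values that never occur (or only occur last) stay all zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def to_channel(matrix: Matrix) -> List[List[int]]:
    """Map probabilities in [0, 1] to 8-bit intensities (assumed scaling)."""
    return [[round(p * 255) for p in row] for row in matrix]


def rgb_markov_image(file_r: bytes, file_g: bytes, file_b: bytes) -> List[List[Pixel]]:
    """Stack the Markov matrices of three files as the R, G and B channels."""
    red, green, blue = (
        to_channel(markov_matrix(f)) for f in (file_r, file_g, file_b)
    )
    return [
        [(red[i][j], green[i][j], blue[i][j]) for j in range(256)]
        for i in range(256)
    ]
```

Because the matrix size depends only on the byte alphabet, every APK yields an image of the same 256 x 256 x 3 shape regardless of file length, which is what makes this kind of representation convenient as input to a convolutional classifier.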
About the journal:
Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications relevant to information security. JISA links a vibrant scientific and research community with industry professionals by offering a clear view of modern problems and challenges in information security and by identifying promising scientific and "best-practice" solutions. JISA issues balance original research work and innovative industrial approaches from internationally renowned information security experts and researchers.