A. Cuzzocrea, Miguel Quebrado, Abderraouf Hafsaoui, Edoardo Serra
2023 10th IEEE Swiss Conference on Data Science (SDS). Published 2023-06-01. DOI: 10.1109/SDS57534.2023.00012 (https://doi.org/10.1109/SDS57534.2023.00012)
A Graph-Representation-Learning Framework for Supporting Android Malware Identification and Polymorphic Evolution
Malware detection is an active research area, yet the polymorphic nature of malware makes it difficult to identify, particularly with hash-based detection methods. Unlike image-based strategies, this research uses a graph-based technique that extracts control flow graphs from Android APK binaries. To handle the generated graphs, we combine a novel graph representation learning method, Inferential SIR-GN, which preserves graph structural similarities, with XGBoost, a standard machine learning model. The approach is then applied to MALNET, a publicly accessible cybersecurity database that contains image-based and graph-based representations of 1,262,024 Android APK binary files spanning 47 types and 696 families. The experimental results indicate that our graph-based strategy outperforms the image-based approach in terms of detection accuracy.
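The contrast the abstract draws between hash-based detection and structure-preserving graph representations can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the byte strings, the block names, and the two toy control flow graphs are invented, and the degree-based `structural_signature` function is a deliberately simple stand-in, not Inferential SIR-GN and not the authors' pipeline. It only shows why a byte-level hash changes under a trivial mutation while a descriptor computed from the control flow structure can stay the same.

```python
import hashlib
from collections import Counter


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hex."""
    return hashlib.sha256(payload).hexdigest()


def structural_signature(edges):
    """Toy structural descriptor (hypothetical, NOT Inferential SIR-GN).

    Returns the sorted multiset of (in-degree, out-degree) pairs over all
    nodes, which does not depend on node names or on the bytes inside
    each basic block.
    """
    in_deg, out_deg = Counter(), Counter()
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    return tuple(sorted((in_deg[n], out_deg[n]) for n in nodes))


# Invented payloads: the variant differs only by two padding bytes.
original_bytes = b"\x12\x00\x0e\x00"
variant_bytes = b"\x12\x00\x00\x00\x0e\x00"

# Invented control flow graphs with the same shape but renamed blocks.
original_cfg = [("entry", "check"), ("check", "send"),
                ("check", "exit"), ("send", "exit")]
variant_cfg = [("b0", "b1"), ("b1", "b3"), ("b1", "b2"), ("b2", "b3")]

hashes_match = sha256_hex(original_bytes) == sha256_hex(variant_bytes)
structures_match = (structural_signature(original_cfg)
                    == structural_signature(variant_cfg))

print("hashes match:", hashes_match)
print("structural signatures match:", structural_signature(original_cfg),
      structures_match)
```

In a real setting the fixed-length graph embeddings would come from a learned representation and be passed to a classifier such as XGBoost. The degree signature here is far too coarse for that purpose and serves only to make the invariance idea concrete.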