A Graph-Representation-Learning Framework for Supporting Android Malware Identification and Polymorphic Evolution

2023 10th IEEE Swiss Conference on Data Science (SDS). Pub Date: 2023-06-01. DOI: 10.1109/SDS57534.2023.00012

A. Cuzzocrea, Miguel Quebrado, Abderraouf Hafsaoui, Edoardo Serra


Citations: 1

Abstract

Detecting malware is an interesting but difficult research area: the polymorphic nature of malware makes it hard to identify, particularly with hash-based detection methods. Unlike image-based strategies, this research uses a graph-based technique that extracts control flow graphs from Android APK binaries. To handle the generated graphs, we combine a novel graph representation learning method called Inferential SIR-GN, which retains graph structural similarities, with XGBoost, a typical machine learning model. The approach is then applied to MALNET, a publicly accessible cybersecurity database that contains image-based and graph-based representations of 1,262,024 Android APK binary files spanning 47 types and 696 families. The experimental results indicate that our graph-based strategy outperforms the image-based approach in terms of detection accuracy.
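To make the motivation concrete, the following is a minimal Python sketch of why a hash-based check can miss a polymorphic variant while a structural view of the control flow graph does not. It is not Inferential SIR-GN and not the authors' pipeline: the byte strings, the toy graphs, and the `structural_signature` helper (a sorted list of per-node in/out degrees) are all hypothetical stand-ins chosen only for illustration.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def structural_signature(edges):
    """Toy structural fingerprint (an assumption, not Inferential SIR-GN).

    Returns the sorted tuple of (in_degree, out_degree) pairs, one per node,
    so the result does not depend on node names.
    """
    in_deg, out_deg = Counter(), Counter()
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    return tuple(sorted((in_deg[n], out_deg[n]) for n in nodes))


# Hypothetical binaries: variant B is variant A with one junk byte prepended.
variant_a_bytes = b"\x12\x34\x56\x78"
variant_b_bytes = b"\x00" + variant_a_bytes

# Hypothetical control flow graphs: same shape, different block names.
cfg_a = [("entry", "check"), ("check", "payload"),
         ("check", "exit"), ("payload", "exit")]
cfg_b = [("b0", "b1"), ("b1", "b2"), ("b1", "b3"), ("b2", "b3")]

hashes_match = sha256_hex(variant_a_bytes) == sha256_hex(variant_b_bytes)
signatures_match = structural_signature(cfg_a) == structural_signature(cfg_b)

print("hashes match:    ", hashes_match)
print("signatures match:", signatures_match)
```

In this toy case the two digests differ while the structural signatures coincide. A learned structural embedding passed to a classifier such as XGBoost plays the role of the hand-written signature in the approach the abstract describes.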

