Meta-Learning for Multi-Family Android Malware Classification

IF 6.6 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING
Yao Li, Dawei Yuan, Tao Zhang, Haipeng Cai, David Lo, Cuiyun Gao, Xiapu Luo, He Jiang
{"title":"Meta-Learning for Multi-Family Android Malware Classification","authors":"Yao Li, Dawei Yuan, Tao Zhang, Haipeng Cai, David Lo, Cuiyun Gao, Xiapu Luo, He Jiang","doi":"10.1145/3664806","DOIUrl":null,"url":null,"abstract":"<p>With the emergence of smartphones, Android has become a widely used mobile operating system. However, it is vulnerable when encountering various types of attacks. Every day, new malware threatens the security of users’ devices and private data. Many methods have been proposed to classify malicious applications, utilizing static or dynamic analysis for classification. However, previous methods still suffer from unsatisfactory performance due to two challenges. First, they are unable to address the imbalanced data distribution problem, leading to poor performance for malware families with few members. Second, they are unable to address the zero-day malware (zero-day malware refers to malicious applications that exploit unknown vulnerabilities) classification problem. In this paper, we introduce an innovative <b>meta</b>-learning approach for <b>m</b>ulti-family <b>A</b>ndroid <b>m</b>alware <b>c</b>lassification named <b>Meta-MAMC</b>, which uses meta-learning technology to learn meta-knowledge (i.e. the similarities and differences among different malware families) of few-family samples and combines new sampling algorithms to solve the above challenges. <monospace>Meta-MAMC</monospace> integrates (i) the meta-knowledge contained within the dataset to guide models in learning to identify unknown malware, and (ii) more accurate and diverse tasks based on novel sampling strategies, as well as directly adapting meta-learning to a new few-sample and zero-sample task to classify families. We have evaluated <monospace>Meta-MAMC</monospace> on two popular datasets and a corpus of real-world Android applications. The results demonstrate its efficacy in accurately classifying malicious applications belonging to certain malware families, even achieving 100% classification in some families.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"44 1","pages":""},"PeriodicalIF":6.6000,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3664806","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

Abstract

With the emergence of smartphones, Android has become a widely used mobile operating system. However, it is vulnerable when encountering various types of attacks. Every day, new malware threatens the security of users’ devices and private data. Many methods have been proposed to classify malicious applications, utilizing static or dynamic analysis for classification. However, previous methods still suffer from unsatisfactory performance due to two challenges. First, they are unable to address the imbalanced data distribution problem, leading to poor performance for malware families with few members. Second, they are unable to address the zero-day malware (zero-day malware refers to malicious applications that exploit unknown vulnerabilities) classification problem. In this paper, we introduce an innovative meta-learning approach for multi-family Android malware classification named Meta-MAMC, which uses meta-learning technology to learn meta-knowledge (i.e. the similarities and differences among different malware families) of few-family samples and combines new sampling algorithms to solve the above challenges. Meta-MAMC integrates (i) the meta-knowledge contained within the dataset to guide models in learning to identify unknown malware, and (ii) more accurate and diverse tasks based on novel sampling strategies, as well as directly adapting meta-learning to a new few-sample and zero-sample task to classify families. We have evaluated Meta-MAMC on two popular datasets and a corpus of real-world Android applications. The results demonstrate its efficacy in accurately classifying malicious applications belonging to certain malware families, even achieving 100% classification in some families.

用于多家族安卓恶意软件分类的元学习
随着智能手机的出现,安卓已成为一种广泛使用的移动操作系统。然而,它在遭遇各种类型的攻击时也很脆弱。每天都有新的恶意软件威胁着用户设备和私人数据的安全。人们提出了许多方法来对恶意应用程序进行分类,利用静态或动态分析进行分类。然而,以往的方法由于面临两个挑战,性能仍不尽如人意。首先,这些方法无法解决数据分布不平衡的问题,导致对于成员较少的恶意软件家族来说性能不佳。其次,它们无法解决零日恶意软件(零日恶意软件指利用未知漏洞的恶意应用程序)分类问题。本文介绍了一种用于多家族安卓恶意软件分类的创新元学习方法--Meta-MAMC,该方法利用元学习技术学习少家族样本的元知识(即不同恶意软件家族之间的异同),并结合新的采样算法来解决上述难题。Meta-MAMC 整合了:(1)数据集中包含的元知识,以指导模型学习识别未知恶意软件;(2)基于新型采样策略的更准确、更多样化的任务,以及直接将元学习适应于新的少样本和零样本任务,以对家族进行分类。我们在两个流行数据集和一个真实安卓应用语料库上对 Meta-MAMC 进行了评估。结果表明,Meta-MAMC 能够准确地对属于某些恶意软件家族的恶意应用程序进行分类,在某些家族中甚至达到了 100% 的分类率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
ACM Transactions on Software Engineering and Methodology
ACM Transactions on Software Engineering and Methodology 工程技术-计算机:软件工程
CiteScore
6.30
自引率
4.50%
发文量
164
审稿时长
>12 weeks
期刊介绍: Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信