VeriBin: A Malware Authorship Verification Approach for APT Tracking through Explainable and Functionality-Debiasing Adversarial Representation Learning

IF 3 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Privacy and Security Pub Date : 2024-07-20 DOI:10.1145/3669901

Weihan Ou, Steven H. H. Ding, Mohammad Zulkernine, Li Tao Li, Sarah Labrosse


Citations: 0

Abstract

Malware attacks pose a significant threat to national security, corporate networks, and public endpoint security. Identifying the Advanced Persistent Threat (APT) groups behind the attacks and grouping their activities into attack campaigns helps security investigators trace those activities, thus providing better protection against future attacks. Existing Cyber Threat Intelligence (CTI) components mainly focus on malware family identification and behavior characterization, which cannot solve the APT tracking problem: APT tracking requires linking malware binaries of multiple families to a single threat actor, whereas these behavior- or function-based techniques are tied to a specific attack technique and fail to connect different families. Binary Authorship Attribution (AA) solutions can discriminate between threat actors based on their stylometric traits. However, AA solutions assume that the author of a binary is within a fixed candidate author set, whereas real-world malware binaries may be created by a new, unknown threat actor. To address this research gap, we propose VeriBin for the Binary Authorship Verification (BAV) problem. VeriBin is a novel adversarial neural network that extracts functionality-agnostic style representations from assembly code for the authorship verification (AV) task. The extracted style representations can be visualized and are explainable through VeriBin’s multi-head attention mechanism. We benchmark VeriBin against state-of-the-art coding style representations on a standard dataset and a recent malware-APT dataset. Given two anonymous binaries of out-of-sample authors, VeriBin can accurately determine whether they belong to the same author or not. VeriBin is resilient to compiler optimizations and robust against malware family variants.



Source journal

ACM Transactions on Privacy and Security Computer Science-General Computer Science

CiteScore

5.20

Self-citation rate

0.00%

Publication volume

About the journal: ACM Transactions on Privacy and Security (TOPS) (formerly known as TISSEC) publishes high-quality research results in the fields of information and system security and privacy. Studies addressing all aspects of these fields are welcomed, ranging from technologies, to systems and applications, to the crafting of policies.