GLDOC: detection of implicitly malicious MS-Office documents using graph convolutional networks

IF 3.9 · Zone 4 (Computer Science) · Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Wenbo Wang, Peng Yi, Taotao Kou, Weitao Han, Chengyu Wang
Journal: Cybersecurity. Published: 2024-07-25. DOI: 10.1186/s42400-024-00243-7 (https://doi.org/10.1186/s42400-024-00243-7)
Citations: 0

Abstract

Malicious MS-Office documents have become one of the most effective attack vectors in APT attacks. Although many protection mechanisms exist, they have proved easy to bypass, and existing detection methods perform poorly on malicious documents that exploit unknown vulnerabilities or exhibit few malicious behaviors. In this paper, we first define im-documents: vulnerable documents that show implicitly malicious behaviors and evade most public antivirus engines. We then present GLDOC, a GCN-based framework that aims to detect im-documents effectively through dynamic analysis and to cover the blind spots of earlier detection methods. Beyond system calls, which are the sole focus of most prior research, we capture all dynamic behaviors in a sandbox, take the process tree into account, and reconstruct both into graphs. With each channel learning one graph, GLDOC trains a 2-channel network together with a classifier, recasting malicious document detection as a graph learning and classification problem. Experiments show that GLDOC balances accuracy and false alarm rate well, at 95.33% and 4.33% respectively, outperforming other detection methods. In further testing on a simulated 5-day attack scenario, the framework maintains stable, high detection accuracy on unknown vulnerabilities.
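
The abstract does not give architectural details, so the following is only a minimal sketch of what a two-channel graph convolutional classifier could look like, not the authors' implementation. It assumes the standard GCN propagation rule (symmetric-normalized adjacency with self-loops), undirected graphs, mean pooling over nodes, and a logistic classifier on the concatenated embeddings. All names (`normalize_adjacency`, `graph_embedding`, `two_channel_score`), the toy graphs, and the random weights are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (assumed standard GCN normalization)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degrees are >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_embedding(adj, feats, weights):
    """Apply stacked graph convolutions, then mean-pool nodes into one vector."""
    a_norm = normalize_adjacency(adj)
    h = feats
    for w in weights:
        h = np.maximum(a_norm @ h @ w, 0.0)  # ReLU(A_norm H W)
    return h.mean(axis=0)


def two_channel_score(behavior_graph, process_graph, params):
    """Embed each graph in its own channel and classify the concatenation."""
    z_behavior = graph_embedding(*behavior_graph, params["behavior"])
    z_process = graph_embedding(*process_graph, params["process"])
    z = np.concatenate([z_behavior, z_process])
    logit = z @ params["clf_w"] + params["clf_b"]
    return 1.0 / (1.0 + np.exp(-logit))      # sigmoid: score in [0, 1]


# Toy inputs (hypothetical): a 3-node behavior graph and a 2-node process tree,
# each node carrying a 4-dimensional feature vector.
rng = np.random.default_rng(0)
behavior_adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
behavior_x = rng.normal(size=(3, 4))
process_adj = np.array([[0, 1], [1, 0]], dtype=float)
process_x = rng.normal(size=(2, 4))

# Untrained random weights, scaled down; a real model would learn these.
params = {
    "behavior": [0.1 * rng.normal(size=(4, 8)), 0.1 * rng.normal(size=(8, 8))],
    "process": [0.1 * rng.normal(size=(4, 8)), 0.1 * rng.normal(size=(8, 8))],
    "clf_w": 0.1 * rng.normal(size=16),
    "clf_b": 0.0,
}

score = two_channel_score(
    (behavior_adj, behavior_x), (process_adj, process_x), params
)
print(f"maliciousness score (untrained): {score:.3f}")
```

Keeping the two channels separate until the final concatenation lets each one specialize on its own graph type. Whether GLDOC fuses the channels this way is an assumption here.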

Source journal: Cybersecurity (Computer Science, Information Systems). CiteScore: 7.30. Self-citation rate: 0.00%. Articles published: 77. Review time: 9 weeks.