Power-ASTNN: A deobfuscation and AST neural network enabled effective detection method for malicious PowerShell Scripts

IF 4.8 | CAS Zone 2 (Computer Science) | JCR Q1 (Computer Science, Information Systems)

Computers & Security | Pub Date: 2025-03-22 | DOI: 10.1016/j.cose.2025.104441

Sanfeng Zhang, Shangze Li, Juncheng Lu, Wang Yang

Computers & Security, Volume 154, Article 104441. https://www.sciencedirect.com/science/article/pii/S0167404825001300

Citations: 0

Abstract

PowerShell is frequently used by attackers against Windows systems, particularly in cyberattack activities such as information stealing, vulnerability exploitation, and password cracking. To evade detection, attackers often obfuscate their scripts. Current detection solutions are held back by limited deobfuscation methods and a predominant focus on static and local features, which hinders their ability to capture fine-grained code features and long-distance semantic relationships and reduces robustness and accuracy. To address these issues, this paper presents a novel malicious script detection method, Power-ASTNN, which integrates deobfuscation and a tree neural network. First, the method uses an AMSI (Antimalware Scan Interface) memory dump to deobfuscate PowerShell scripts, yielding fully deobfuscated samples. Next, a subtree splitting algorithm tailored to abstract syntax trees extracts fine-grained code features from subtree fragments. Finally, a two-layer neural network model encodes representations based on subtree node semantics and sequence semantics, capturing the semantic characteristics of the code. In experiments, Power-ASTNN achieves an accuracy of 98.87% on a self-built dataset collected from multiple publicly available sources, while maintaining a low false negative rate and an area under the curve (AUC) exceeding 0.995. Power-ASTNN also shows better detection performance on adversarial samples than existing detection models.
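The abstract does not spell out the subtree splitting algorithm, so the following is only a minimal sketch of the general idea of statement-level AST splitting. It assumes Python's built-in `ast` module as a stand-in for a PowerShell parser, and the names `node_tokens` and `split_statements` are hypothetical, not taken from the paper. Each statement becomes its own subtree, and statements nested inside a compound statement are cut out into separate subtrees, so every fragment stays small.

```python
import ast
from typing import List


def node_tokens(node: ast.AST) -> List[str]:
    """Preorder node-type names of one statement subtree.

    Nested statements are skipped here because they are emitted
    as separate subtrees by split_statements (assumed split rule).
    """
    tokens = [type(node).__name__]
    for child in ast.iter_child_nodes(node):
        if isinstance(child, ast.stmt):
            continue
        tokens.extend(node_tokens(child))
    return tokens


def split_statements(tree: ast.AST) -> List[List[str]]:
    """Return one token sequence per statement, in preorder."""
    subtrees: List[List[str]] = []

    def visit(node: ast.AST) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.stmt):
                subtrees.append(node_tokens(child))
            visit(child)  # keep descending to find nested statements

    visit(tree)
    return subtrees


# Python source used purely as an illustration; the paper targets PowerShell.
source = """
x = 1
if x > 0:
    y = x + 1
"""

subtrees = split_statements(ast.parse(source))
for tokens in subtrees:
    print(tokens)
```

In a pipeline of the kind the abstract describes, each token sequence would then be embedded by a node-level encoder, and the resulting sequence of subtree vectors would be passed to a sequence-level encoder. Those two stages are not shown here.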


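Source journal

Computers & Security (Engineering & Technology; Computer Science: Information Systems)

CiteScore: 12.40

Self-citation rate: 7.10%

Articles published: 365

Review time: 10.7 months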

About the journal: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.