Advanced Windows Methods on Malware Detection and Classification

Dima Rabadi, S. Teo
{"title":"Advanced Windows Methods on Malware Detection and Classification","authors":"Dima Rabadi, S. Teo","doi":"10.1145/3427228.3427242","DOIUrl":null,"url":null,"abstract":"Application Programming Interfaces (APIs) are still considered the standard accessible data source and core wok of the most widely adopted malware detection and classification techniques. API-based malware detectors highly rely on measuring API’s statistical features, such as calculating the frequency counter of calling specific API calls or finding their malicious sequence pattern (i.e., signature-based detectors). Using simple hooking tools, malware authors would help in failing such detectors by interrupting the sequence and shuffling the API calls or deleting/inserting irrelevant calls (i.e., changing the frequency counter). Moreover, relying on API calls (e.g., function names) alone without taking into account their function parameters is insufficient to understand the purpose of the program. For example, the same API call (e.g., writing on a file) would act in two ways if two different arguments are passed (e.g., writing on a system versus user file). However, because of the heterogeneous nature of API arguments, most of the available API-based malicious behavior detectors would consider only the API calls without taking into account their argument information (e.g., function parameters). Alternatively, other detectors try considering the API arguments in their techniques, but they acquire having proficient knowledge about the API arguments or powerful processors to extract them. Such requirements demand a prohibitive cost and complex operations to deal with the arguments. To overcome the above limitations, with the help of machine learning and without any expert knowledge of the arguments, we propose a light-weight API-based dynamic feature extraction technique, and we use it to implement a malware detection and type classification approach. To evaluate our approach, we use reasonable datasets of 7774 benign and 7105 malicious samples belonging to ten distinct malware types. Experimental results show that our type classification module could achieve an accuracy of , where our malware detection module could reach an accuracy of over , and outperforms many state-of-the-art API-based malware detectors.","PeriodicalId":175869,"journal":{"name":"Annual Computer Security Applications Conference","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annual Computer Security Applications Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3427228.3427242","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

Abstract

Application Programming Interfaces (APIs) are still considered the standard accessible data source and core wok of the most widely adopted malware detection and classification techniques. API-based malware detectors highly rely on measuring API’s statistical features, such as calculating the frequency counter of calling specific API calls or finding their malicious sequence pattern (i.e., signature-based detectors). Using simple hooking tools, malware authors would help in failing such detectors by interrupting the sequence and shuffling the API calls or deleting/inserting irrelevant calls (i.e., changing the frequency counter). Moreover, relying on API calls (e.g., function names) alone without taking into account their function parameters is insufficient to understand the purpose of the program. For example, the same API call (e.g., writing on a file) would act in two ways if two different arguments are passed (e.g., writing on a system versus user file). However, because of the heterogeneous nature of API arguments, most of the available API-based malicious behavior detectors would consider only the API calls without taking into account their argument information (e.g., function parameters). Alternatively, other detectors try considering the API arguments in their techniques, but they acquire having proficient knowledge about the API arguments or powerful processors to extract them. Such requirements demand a prohibitive cost and complex operations to deal with the arguments. To overcome the above limitations, with the help of machine learning and without any expert knowledge of the arguments, we propose a light-weight API-based dynamic feature extraction technique, and we use it to implement a malware detection and type classification approach. To evaluate our approach, we use reasonable datasets of 7774 benign and 7105 malicious samples belonging to ten distinct malware types. Experimental results show that our type classification module could achieve an accuracy of , where our malware detection module could reach an accuracy of over , and outperforms many state-of-the-art API-based malware detectors.
高级Windows恶意软件检测和分类方法
应用程序编程接口(api)仍然被认为是标准的可访问数据源和最广泛采用的恶意软件检测和分类技术的核心工作。基于API的恶意软件检测器高度依赖于测量API的统计特征,例如计算调用特定API调用的频率计数器或查找其恶意序列模式(即基于签名的检测器)。使用简单的钩子工具,恶意软件作者将通过中断序列和打乱API调用或删除/插入无关调用(即改变频率计数器)来帮助失败此类检测器。此外,仅仅依靠API调用(例如,函数名)而不考虑它们的函数参数是不足以理解程序的目的的。例如,如果传递两个不同的参数(例如,在系统文件和用户文件上写入),同一个API调用(例如,在文件上写入)将以两种方式运行。然而,由于API参数的异构性质,大多数可用的基于API的恶意行为检测器将只考虑API调用,而不考虑其参数信息(例如,函数参数)。或者,其他检测器尝试在它们的技术中考虑API参数,但它们需要对API参数或强大的处理器具有精通的知识来提取它们。这样的需求需要高昂的成本和复杂的操作来处理参数。为了克服上述局限性,在机器学习的帮助下,在没有任何专家知识的情况下,我们提出了一种基于轻量级api的动态特征提取技术,并使用它来实现恶意软件检测和类型分类方法。为了评估我们的方法,我们使用了7774个良性和7105个恶意样本的合理数据集,这些样本属于10种不同的恶意软件类型。实验结果表明,我们的类型分类模块可以达到的准确率为,而我们的恶意软件检测模块可以达到的准确率为,并且优于许多最先进的基于api的恶意软件检测器。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信