Rajchada Chanajitt, B. Pfahringer, Heitor Murilo Gomes
Title: Combining Static and Dynamic Analysis to Improve Machine Learning-based Malware Classification
Venue: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)
Publication date: 2021-10-06
DOI: https://doi.org/10.1109/DSAA53316.2021.9564144
Citations: 3
Abstract
Windows Portable Executable (PE) files can be deliberately malformed for malicious purposes, and there are many tricks for circumventing standard security detection and protection measures. For example, one can bypass Windows Defender Firewall by creating a writable file in a user's temporary folder whose filename looks like that of a legitimate process (e.g. svchost.exe, chrome32.exe, or dllhost32.exe) and executing it without user intervention. In this work, we leverage both static properties and dynamic behaviour analysis for malware classification. For dynamic analysis, information is retrieved from the Falcon Sandbox malware website. In addition, we run malware in a virtualised Windows 10 environment and analyse memory dumps to generate further features that may capture potentially malicious behaviour. Three classifiers are evaluated in our empirical experiments: random forests, gradient boosting, and neural networks. For every model, combining static and dynamic features consistently yields a higher F1-score than training the same model on static or dynamic features alone. The best models achieve F1-scores of up to 98.9%.
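The two ideas the abstract relies on, merging static and dynamic features into one vector per sample and scoring a classifier with F1, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The feature names (`num_sections`, `dropped_files`) and the helper names are hypothetical, and the F1 here is the plain binary form (malware = 1, benign = 0), whereas the paper may use a multi-class or averaged variant.

```python
from typing import Dict, List


def combine_features(static: Dict[str, float],
                     dynamic: Dict[str, float]) -> Dict[str, float]:
    """Merge static and dynamic features into one feature dictionary.

    Keys are prefixed so that a feature name present in both sources
    does not get overwritten.
    """
    combined = {f"static_{name}": value for name, value in static.items()}
    combined.update({f"dynamic_{name}": value for name, value in dynamic.items()})
    return combined


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical feature values for a single sample.
    sample = combine_features({"num_sections": 5.0}, {"dropped_files": 2.0})
    print(sample)
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a comparison like the one described, the same classifier would be trained three times (static only, dynamic only, combined) and the resulting F1-scores compared on held-out samples.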