Towards Robust Detection of PDF-based Malware
Kai Yuan Tay, Shawn Chua, M. Chua, Vivek Balachandran
Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, 2022-04-14. DOI: 10.1145/3508398.3519365
Abstract
With the indisputable prevalence of PDFs, several studies into PDF malware and its evasive variants have been conducted to test the robustness of the ML-based PDF classifier frameworks Hidost and Mimicus. As heavily documented, the fundamental difference between them is that Hidost investigates the logical structure of PDFs, while Mimicus detects malicious indicators through their structural features. However, there exist techniques to mutate such features so that malicious PDFs can bypass these classifiers. In this work, we investigated three known attacks, Mimicry, Mimicry+, and Reverse Mimicry, to compare how effective they are at evading the classifiers in Hidost and Mimicus. The results show that Mimicry and Mimicry+ are effective at bypassing the models in Mimicus but not in Hidost, while Reverse Mimicry is effective against the models in both Mimicus and Hidost.
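To make the abstract's distinction concrete: structural-feature detectors of the kind Mimicus represents score a PDF by counting occurrences of structural keywords in the file. The sketch below is a minimal, hypothetical illustration of that idea only; the token list and function are assumptions for demonstration and are not the actual Mimicus feature set or the paper's method. It also shows why mimicry-style attacks work: padding a malicious file with benign-looking structure shifts these counts without touching the payload.

```python
# Hypothetical sketch of keyword-count features over raw PDF bytes.
# SUSPICIOUS_TOKENS is an illustrative sample, not the Mimicus feature set.
SUSPICIOUS_TOKENS = [b"/JavaScript", b"/JS", b"/OpenAction", b"/Launch", b"/AA"]

def structural_features(pdf_bytes: bytes) -> dict:
    """Count occurrences of a few structural keywords in raw PDF bytes."""
    return {tok.decode(): pdf_bytes.count(tok) for tok in SUSPICIOUS_TOKENS}

# A tiny hand-written PDF fragment with an auto-executing JavaScript action.
sample = (b"%PDF-1.4\n"
          b"1 0 obj\n<< /OpenAction << /S /JavaScript /JS (app.alert(1)) >> >>\nendobj\n"
          b"trailer\n%%EOF")
print(structural_features(sample))
```

A classifier trained on such counts never inspects the logical object tree, which is why Hidost, operating on structural paths through that tree, resists feature-mutation attacks that fool flat keyword statistics.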