Accurate Adware Detection Using Opcode Sequence Extraction

R. Shahzad, Niklas Lavesson, H. Johnson
{"title":"Accurate Adware Detection Using Opcode Sequence Extraction","authors":"R. Shahzad, Niklas Lavesson, H. Johnson","doi":"10.1109/ARES.2011.35","DOIUrl":null,"url":null,"abstract":"Adware represents a possible threat to the security and privacy of computer users. Traditional signature-based and heuristic-based methods have not been proven to be successful at detecting this type of software. This paper presents an adware detection approach based on the application of data mining on disassembled code. The main contributions of the paper is a large publicly available adware data set, an accurate adware detection algorithm, and an extensive empirical evaluation of several candidate machine learning techniques that can be used in conjunction with the algorithm. We have extracted sequences of opcodes from adware and benign software and we have then applied feature selection, using different configurations, to obtain 63 data sets. Six data mining algorithms have been evaluated on these data sets in order to find an efficient and accurate detector. Our experimental results show that the proposed approach can be used to accurately detect both novel and known adware instances even though the binary difference between adware and legitimate software is usually small.","PeriodicalId":254443,"journal":{"name":"2011 Sixth International Conference on Availability, Reliability and Security","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 Sixth International Conference on Availability, Reliability and Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARES.2011.35","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 26

Abstract

Adware represents a possible threat to the security and privacy of computer users. Traditional signature-based and heuristic-based methods have not been proven to be successful at detecting this type of software. This paper presents an adware detection approach based on the application of data mining on disassembled code. The main contributions of the paper is a large publicly available adware data set, an accurate adware detection algorithm, and an extensive empirical evaluation of several candidate machine learning techniques that can be used in conjunction with the algorithm. We have extracted sequences of opcodes from adware and benign software and we have then applied feature selection, using different configurations, to obtain 63 data sets. Six data mining algorithms have been evaluated on these data sets in order to find an efficient and accurate detector. Our experimental results show that the proposed approach can be used to accurately detect both novel and known adware instances even though the binary difference between adware and legitimate software is usually small.
使用操作码序列提取准确的广告软件检测
广告软件可能对计算机用户的安全和隐私构成威胁。传统的基于签名和启发式的方法在检测此类软件方面尚未被证明是成功的。提出了一种基于数据挖掘在反汇编代码中的应用的广告软件检测方法。本文的主要贡献是一个大型的公开广告软件数据集,一个准确的广告软件检测算法,以及对几种候选机器学习技术的广泛经验评估,这些技术可以与该算法结合使用。我们从广告软件和良性软件中提取了操作码序列,然后应用特征选择,使用不同的配置,获得63个数据集。为了找到一种高效、准确的检测器,我们在这些数据集上对六种数据挖掘算法进行了评估。我们的实验结果表明,即使广告软件和合法软件之间的二进制差异通常很小,所提出的方法也可以用来准确地检测新的和已知的广告软件实例。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信