ByteFreq: Malware clustering using byte frequency

Nirmal Singh, S. S. Khurmi
Published in: 2016 5th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), 2016-09-01
DOI: 10.1109/ICRITO.2016.7784976 (https://doi.org/10.1109/ICRITO.2016.7784976)
Citations: 4

Abstract

The growing number of malware samples poses many challenges for antivirus companies. One of these challenges is clustering the large number of samples they receive daily. Malware authors use malware generation kits to create different instances of the same malware, so most of these malicious samples are polymorphic instances of previously known malware families. Clustering this large volume of samples rapidly and accurately, without spending much time processing each sample, has become a critical requirement. In this paper we propose, implement, and evaluate a method called ByteFreq that clusters large numbers of samples using byte frequency. Byte frequency is represented as a time series, and SAX (Symbolic Aggregation approXimation) [1] is used to convert the time series into a symbolic representation. We evaluated the proposed system on real-world malware samples and achieved 0.92 precision and 0.96 recall.
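
The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes the "time series" is the 256-bin relative byte-frequency histogram indexed by byte value, and it applies the standard SAX steps (z-normalization, Piecewise Aggregate Approximation, discretization against Gaussian breakpoints). The function names, the segment count of 16, and the four-letter alphabet are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist, mean, pstdev


def byte_histogram(data: bytes) -> list[float]:
    """Relative frequency of each byte value 0..255 (assumed feature)."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero on empty input
    return [counts.get(value, 0) / total for value in range(256)]


def znorm(series: list[float]) -> list[float]:
    """Z-normalize to zero mean and unit variance; flat series become zeros."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def paa(series: list[float], segments: int) -> list[float]:
    """Piecewise Aggregate Approximation: mean of equal-width segments."""
    if len(series) % segments != 0:
        raise ValueError("series length must be divisible by segments")
    width = len(series) // segments
    return [mean(series[i:i + width]) for i in range(0, len(series), width)]


def sax_word(series: list[float], segments: int = 16,
             alphabet: str = "abcd") -> str:
    """Convert a numeric series to a SAX word (parameters are illustrative)."""
    k = len(alphabet)
    # Breakpoints split the standard normal into k equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    reduced = paa(znorm(series), segments)
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in reduced)


def bytefreq_signature(data: bytes) -> str:
    """Hypothetical helper: SAX word of a sample's byte histogram."""
    return sax_word(byte_histogram(data))
```

Under these assumptions, each sample reduces to a short string, so samples could then be grouped by identical or similar strings without further processing of the binaries. How ByteFreq actually compares or groups the symbolic representations is not stated in the abstract.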