Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection

Konstantinos F. Xylogiannopoulos, P. Karampelas, R. Alhajj
{"title":"Text Mining for Malware Classification Using Multivariate All Repeated Patterns Detection","authors":"Konstantinos F. Xylogiannopoulos, P. Karampelas, R. Alhajj","doi":"10.1145/3341161.3350841","DOIUrl":null,"url":null,"abstract":"Mobile phones have become nowadays a commodity to the majority of people. Using them, people are able to access the world of Internet and connect with their friends, their colleagues at work or even unknown people with common interests. This proliferation of the mobile devices has also been seen as an opportunity for the cyber criminals to deceive smartphone users and steel their money directly or indirectly, respectively, by accessing their bank accounts through the smartphones or by blackmailing them or selling their private data such as photos, credit card data, etc. to third parties. This is usually achieved by installing malware to smartphones masking their malevolent payload as a legitimate application and advertise it to the users with the hope that mobile users will install it in their devices. Thus, any existing application can easily be modified by integrating a malware and then presented it as a legitimate one. In response to this, scientists have proposed a number of malware detection and classification methods using a variety of techniques. Even though, several of them achieve relatively high precision in malware classification, there is still space for improvement. In this paper, we propose a text mining all repeated pattern detection method which uses the decompiled files of an application in order to classify a suspicious application into one of the known malware families. Based on the experimental results using a real malware dataset, the methodology tries to correctly classify (without any misclassification) all randomly selected malware applications of 3 categories with 3 different families each.","PeriodicalId":403360,"journal":{"name":"2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"195 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3341161.3350841","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Mobile phones have become nowadays a commodity to the majority of people. Using them, people are able to access the world of Internet and connect with their friends, their colleagues at work or even unknown people with common interests. This proliferation of the mobile devices has also been seen as an opportunity for the cyber criminals to deceive smartphone users and steel their money directly or indirectly, respectively, by accessing their bank accounts through the smartphones or by blackmailing them or selling their private data such as photos, credit card data, etc. to third parties. This is usually achieved by installing malware to smartphones masking their malevolent payload as a legitimate application and advertise it to the users with the hope that mobile users will install it in their devices. Thus, any existing application can easily be modified by integrating a malware and then presented it as a legitimate one. In response to this, scientists have proposed a number of malware detection and classification methods using a variety of techniques. Even though, several of them achieve relatively high precision in malware classification, there is still space for improvement. In this paper, we propose a text mining all repeated pattern detection method which uses the decompiled files of an application in order to classify a suspicious application into one of the known malware families. Based on the experimental results using a real malware dataset, the methodology tries to correctly classify (without any misclassification) all randomly selected malware applications of 3 categories with 3 different families each.
基于多元全重复模式检测的文本挖掘恶意软件分类
手机如今已成为大多数人的一种商品。使用它们,人们能够访问互联网的世界,并与他们的朋友,他们的同事在工作,甚至不认识的人有共同的兴趣。移动设备的激增也被视为网络犯罪分子欺骗智能手机用户并直接或间接地分别通过智能手机访问他们的银行账户或勒索他们或将他们的私人数据(如照片,信用卡数据等)出售给第三方的机会。这通常是通过将恶意软件安装到智能手机上,将其恶意负载伪装成合法应用程序,并向用户宣传,希望移动用户将其安装到他们的设备中。因此,任何现有的应用程序都可以很容易地通过集成恶意软件进行修改,然后将其呈现为合法的应用程序。针对这一点,科学家们提出了一些使用各种技术的恶意软件检测和分类方法。尽管其中有几个在恶意软件分类上达到了较高的精度,但仍有改进的空间。本文提出了一种文本挖掘全重复模式检测方法,该方法利用应用程序的反编译文件将可疑应用程序分类到已知的恶意软件家族中。基于使用真实恶意软件数据集的实验结果,该方法尝试对随机选择的3个不同家族的3类恶意软件应用程序进行正确分类(无任何误分类)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信