On computing similarity of Android executables using text mining: student research abstract

Gyoosik Kim
Proceedings of the Symposium on Applied Computing. Published 2017-04-03. DOI: 10.1145/3019612.3019926 (https://doi.org/10.1145/3019612.3019926)
Citations: 2

Abstract

According to Comscore, Android users in the U.S. spend an average of 2.8 hours per day using mobile media. Meanwhile, according to Statista, Android users could choose among 2.2 million applications in June 2016. Some of these applications have been reported by the Google Android Security Service as malware, viruses, or stolen applications. Tools such as Dex2Jar, apktool, and jd-gui analyze and reverse engineer Android applications, and they can also be used to illegally copy or transform those applications. To protect applications from piracy and theft, it is necessary to detect theft by measuring application similarity. Previous studies on theft detection have measured application similarity at two levels, the source code level and the executable code level, and each has limitations. Source code is unavailable when it is legacy code or was developed by upstream suppliers. At the executable level, similarity is measured either 1) on source code decompiled from the executables, or 2) on characteristics extracted from the executables (i.e., birthmarks). For example, DroidMOSS [5] applied a fuzzy hashing technique to localize and detect the changes introduced by app repackaging. Reference [4] proposed software birthmarks, which capture the unique characteristics of a program, and detected software theft based on them.
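As a rough illustration of the birthmark idea mentioned above (not the method of this abstract or of the cited works), the sketch below treats each executable as a sequence of opcode mnemonics, takes the set of contiguous k-grams as its birthmark, and compares two birthmarks with the Jaccard index. The opcode lists, the function names `kgram_birthmark` and `jaccard`, and the choice of k = 3 are all assumptions made for the example.

```python
from typing import Iterable, Set, Tuple


def kgram_birthmark(opcodes: Iterable[str], k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous k-grams over an opcode sequence."""
    seq = list(opcodes)
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard index |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical opcode sequences; real ones would come from a disassembler.
original = ["const", "invoke", "move", "add", "return"]
repackaged = ["const", "invoke", "move", "add", "nop", "return"]

score = jaccard(kgram_birthmark(original), kgram_birthmark(repackaged))
print(f"similarity = {score:.2f}")  # prints "similarity = 0.40"
```

Here the inserted `nop` breaks one of the original's three 3-grams and adds two new ones, so the birthmarks share 2 of 5 distinct k-grams. A set-based measure like this ignores how often a k-gram occurs; a frequency-weighted variant would be an equally plausible choice.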