Spam ain't as diverse as it seems: throttling OSN spam with templates underneath

Hongyu Gao, Yi Yang, Kai Bu, Yan Chen, Doug Downey, Kathy Lee, A. Choudhary
{"title":"Spam ain't as diverse as it seems: throttling OSN spam with templates underneath","authors":"Hongyu Gao, Yi Yang, Kai Bu, Yan Chen, Doug Downey, Kathy Lee, A. Choudhary","doi":"10.1145/2664243.2664251","DOIUrl":null,"url":null,"abstract":"In online social networks (OSNs), spam originating from friends and acquaintances not only reduces the joy of Internet surfing but also causes damage to less security-savvy users. Prior countermeasures combat OSN spam from different angles. Due to the diversity of spam, there is hardly any existing method that can independently detect the majority or most of OSN spam. In this paper, we empirically analyze the textual pattern of a large collection of OSN spam. An inspiring finding is that the majority (63.0%) of the collected spam is generated with underlying templates. We therefore propose extracting templates of spam detected by existing methods and then matching messages against the templates toward accurate and fast spam detection. We implement this insight through Tangram, an OSN spam filtering system that performs online inspection on the stream of user-generated messages. Tangram automatically divides OSN spam into segments and uses the segments to construct templates to filter future spam. Experimental results show that Tangram is highly accurate and can rapidly generate templates to throttle newly emerged campaigns. Specifically, Tangram detects the most prevalent template-based spam with 95.7% true positive rate, whereas the existing template generation approach detects only 32.3%. The integration of Tangram and its auxiliary spam filter achieves an overall accuracy of 85.4% true positive rate and 0.33% false positive rate.","PeriodicalId":104443,"journal":{"name":"Proceedings of the 30th Annual Computer Security Applications Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"31","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th Annual Computer Security Applications Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2664243.2664251","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 31

Abstract

In online social networks (OSNs), spam originating from friends and acquaintances not only reduces the joy of Internet surfing but also causes damage to less security-savvy users. Prior countermeasures combat OSN spam from different angles. Due to the diversity of spam, there is hardly any existing method that can independently detect the majority or most of OSN spam. In this paper, we empirically analyze the textual pattern of a large collection of OSN spam. An inspiring finding is that the majority (63.0%) of the collected spam is generated with underlying templates. We therefore propose extracting templates of spam detected by existing methods and then matching messages against the templates toward accurate and fast spam detection. We implement this insight through Tangram, an OSN spam filtering system that performs online inspection on the stream of user-generated messages. Tangram automatically divides OSN spam into segments and uses the segments to construct templates to filter future spam. Experimental results show that Tangram is highly accurate and can rapidly generate templates to throttle newly emerged campaigns. Specifically, Tangram detects the most prevalent template-based spam with 95.7% true positive rate, whereas the existing template generation approach detects only 32.3%. The integration of Tangram and its auxiliary spam filter achieves an overall accuracy of 85.4% true positive rate and 0.33% false positive rate.
垃圾邮件并不像看起来那么多样化:在下面用模板限制OSN垃圾邮件
在在线社交网络(osn)中,来自朋友和熟人的垃圾邮件不仅会降低上网的乐趣,还会对不太安全的用户造成损害。前期对策从不同角度打击OSN垃圾邮件。由于垃圾邮件的多样性,几乎没有任何现有的方法可以独立检测大部分或大部分的OSN垃圾邮件。本文对大量OSN垃圾邮件的文本模式进行了实证分析。一个令人鼓舞的发现是,大多数(63.0%)收集到的垃圾邮件是使用底层模板生成的。因此,我们提出提取现有方法检测到的垃圾邮件模板,然后将邮件与模板进行匹配,以实现准确、快速的垃圾邮件检测。我们通过Tangram实现这种洞察力,这是一个OSN垃圾邮件过滤系统,对用户生成的消息流执行在线检查。Tangram自动将OSN垃圾邮件划分为多个段,并根据这些段构建模板,过滤以后的垃圾邮件。实验结果表明,该算法具有较高的准确率,能够快速生成模板来抑制新出现的活动。具体来说,Tangram检测到最普遍的基于模板的垃圾邮件的真阳性率为95.7%,而现有的模板生成方法仅检测到32.3%。Tangram及其辅助垃圾邮件过滤器的集成实现了85.4%的真阳性率和0.33%的假阳性率的总体准确率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信
小红书