Unwanted SMTP Paths and Relays

Srikanth Palla, R. Dantu

2007 2nd International Conference on Communication Systems Software and Middleware, published 2007-07-09
DOI: 10.1109/COMSWA.2007.382440 (https://doi.org/10.1109/COMSWA.2007.382440)
Citations: 2

Abstract

Based on the social interactions of an email user, incoming email traffic can be divided into categories such as telemarketing, opt-in, family members, and friends. Because they lack knowledge of these categories, most existing spam filters are prone to high false-positive and false-negative rates. Moreover, a majority of spammers obfuscate their email content in order to circumvent content-based spam filters. However, they do not have access to all the fields in the email header. Our classification method is therefore based on the path traversed by an email (instead of content analysis), since we believe that spammers cannot forge all the fields in the email header. We base our classification on three kinds of header analysis: i) EndToEnd path analysis, which tries to establish the legitimacy of the path taken by an email and classifies the email as either spam or non-spam; ii) Relay analysis, which verifies the trustworthiness of the relays that participate in relaying emails; iii) Email wantedness analysis, which measures how much the recipient wants the sender's emails. In the Email wantedness analysis, we use the IMAP message status flags (message has been read, deleted, answered, flagged, or saved as a draft) as implicit feedback from the user. Finally, we classify incoming emails as i) socially close (such as legitimate emails from family and friends), ii) socially distinct (emails from strangers), iii) spam (for example, emails from telemarketers and spammers), and iv) opt-in. Based on the relation between the spamminess of the path taken by spam emails and the unwantedness values of the spammers, we classify spammers as i) prospective spammers, ii) suspects, iii) recent spammers, and iv) serial spammers. Overall, our method resulted in far fewer false positives than current filters such as SpamAssassin. We achieved a precision of 98.65%, which is better than the precision achieved by SPF and DNSBL blacklists.
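
The two inputs the abstract relies on, the relay path recorded in the header and the IMAP status flags, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `relay_path` and `wantedness`, the `FLAG_WEIGHTS` values, and the sample hosts are assumptions chosen for illustration, and the abstract does not say how the flags are weighted. The sketch assumes that each relay prepends its own `Received:` header, so reversing the headers yields the path from the sender side to the recipient side.

```python
import re
from email import message_from_string

# Hypothetical weights for the IMAP system flags; the abstract names the
# flags but not how they are combined, so these numbers are illustrative only.
FLAG_WEIGHTS = {
    "\\Answered": 1.0,
    "\\Flagged": 0.75,
    "\\Seen": 0.25,
    "\\Draft": 0.0,
    "\\Deleted": -0.5,
}


def relay_path(raw_message):
    """Return the hosts named in 'Received: from <host>' headers, sender side first."""
    msg = message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        match = re.match(r"\s*from\s+(\S+)", value)
        if match:
            hops.append(match.group(1).lower())
    # The newest hop is at the top of the header block, so reverse the list.
    hops.reverse()
    return hops


def wantedness(flag_sets):
    """Average the summed flag weights over all messages from one sender."""
    if not flag_sets:
        return 0.0
    totals = [sum(FLAG_WEIGHTS.get(flag, 0.0) for flag in flags) for flags in flag_sets]
    return sum(totals) / len(totals)


# Sample message with two hops; all host names and addresses are made up.
SAMPLE = (
    "Received: from mx.example.net (mx.example.net [192.0.2.10])\n"
    "\tby mail.example.org; Mon, 9 Jul 2007 10:00:02 +0000\n"
    "Received: from sender.example.com (sender.example.com [198.51.100.7])\n"
    "\tby mx.example.net; Mon, 9 Jul 2007 10:00:01 +0000\n"
    "From: alice@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(relay_path(SAMPLE))
    print(wantedness([{"\\Seen", "\\Answered"}, {"\\Seen", "\\Deleted"}]))
```

A real classifier would also have to decide which `Received:` lines to trust, since a sender can insert forged lines below the first relay that the recipient's own infrastructure controls.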