Enhanced content analysis of fraudulent Nigeria electronic mails using e-STAT

O. Longe, A. Abayomi-Alli, I. O. Shaib, F. A. Longe
DOI: 10.1109/ICASTECH.2009.5409717 (https://doi.org/10.1109/ICASTECH.2009.5409717)
Published in: 2009 2nd International Conference on Adaptive Science & Technology (ICAST)
Publication date: 2009-12-01
Citations: 3

Abstract

A large percentage of fraudulent spam mails are believed to originate from Nigeria or from Nigerians in remote locations. These mails (popularly referred to as 419 spam) fall into broad categories, but all are intended to defraud their recipients. Testing the validity of sender and receiver addresses is one method that has been used to filter spam. This approach does not filter out ordinary e-mails, since typical e-mail users include their true e-mail addresses to facilitate replies. Checking the IP addresses of 419 mails is a way of ascertaining their actual origin. This can be done to build a database of e-mail abuse or to blacklist the addresses from which fraudulent mails originate, so that further mail from those addresses can be blocked in the future. To this end, this research examines features selected specifically from the content analysis of Nigerian spam e-mail. A domain-specific statistical content analysis tool (e-STAT) was developed and implemented using a Bayesian statistical technique. The software was trained and tested with a sizeable, balanced corpus of Nigerian 419 spam e-mails and normal (ham) e-mails. Analysis of the classified mails using e-STAT revealed current concept drift patterns among Nigerian 419 spammers and produced a blacklist of about 2,173 e-mail sender addresses, 563 URLs found within spam mails, and a total of 13,491 bag-of-words terms common to Nigerian spam e-mails. The results are intended to guide future research on 419 mails in designing effective spam filters and e-mail classifiers.
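The Bayesian bag-of-words classification mentioned in the abstract can be illustrated with a minimal sketch. This is not the e-STAT implementation, whose details the abstract does not give. The class name `NaiveBayes`, the tokenizer, the use of multinomial naive Bayes with Laplace smoothing, and the four toy training messages are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: lowercase runs of letters, digits and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, text, label):
        # log P(label) + sum of log P(token | label) over known tokens
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + self.alpha * len(self.vocab)
        for tok in tokenize(text):
            if tok in self.vocab:  # tokens never seen in training are skipped
                score += math.log((counts[tok] + self.alpha) / denom)
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda lbl: self.log_score(text, lbl))


# Toy balanced corpus (invented for this sketch, not taken from the paper).
clf = NaiveBayes()
clf.train("urgent transfer of funds from the late minister's bank account", "spam")
clf.train("you are the beneficiary of an inheritance fund, reply with bank details", "spam")
clf.train("meeting moved to tuesday, agenda attached", "ham")
clf.train("please review the attached draft report before the meeting", "ham")

print(clf.classify("confidential inheritance fund transfer"))
print(clf.classify("agenda for the tuesday meeting"))
```

Working in log space avoids numeric underflow on long messages, and retraining on recent mail is one simple way such a model could follow the concept drift the abstract refers to.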