基于聚类和贝叶斯方法的网络钓鱼站点检测混合模型

Rahul Patil, Bhushan Dasharath Dhamdhere, Kaushal Sudhakar Dhonde, Rohit Gopal Chinchwade, Swapnil Balasaheb Mehetre
{"title":"基于聚类和贝叶斯方法的网络钓鱼站点检测混合模型","authors":"Rahul Patil, Bhushan Dasharath Dhamdhere, Kaushal Sudhakar Dhonde, Rohit Gopal Chinchwade, Swapnil Balasaheb Mehetre","doi":"10.1109/I2CT.2014.7092141","DOIUrl":null,"url":null,"abstract":"Phishing sites are the major attacks by which most of internet users are being fooled by the phisher. The replicas of the legitimate sites are created and users are directed to that web site by luring some offers to it. There are certain standards which are given by W3C (World Wide Web Consortium), based on these standards we are choosing some features which can easily describe the difference between legit site and phish site. We are proposing a model to determine the phishing sites to safeguard the web users from phisher. The features of URL along with the features of Web Page in HTML tags are considered to determine the attack. Here Clustering of Database is done through K-Means Clustering and Naive Bayes Classifier prediction technique is applied to determine the probability of the web site as Valid Phish or Invalid Phish. K-Means Clustering is applied on initial URL features and Validity is checked if still we are not able to determine the Validity of Web Site then Naive Bayes Classifier is applied onto URL as well as HTML tag features of Site and probability is evaluated based on training model.","PeriodicalId":384966,"journal":{"name":"International Conference for Convergence for Technology-2014","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"A hybrid model to detect phishing-sites using clustering and Bayesian approach\",\"authors\":\"Rahul Patil, Bhushan Dasharath Dhamdhere, Kaushal Sudhakar Dhonde, Rohit Gopal Chinchwade, Swapnil Balasaheb Mehetre\",\"doi\":\"10.1109/I2CT.2014.7092141\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Phishing sites are the major attacks by which most of internet users are being fooled by the phisher. The replicas of the legitimate sites are created and users are directed to that web site by luring some offers to it. There are certain standards which are given by W3C (World Wide Web Consortium), based on these standards we are choosing some features which can easily describe the difference between legit site and phish site. We are proposing a model to determine the phishing sites to safeguard the web users from phisher. The features of URL along with the features of Web Page in HTML tags are considered to determine the attack. Here Clustering of Database is done through K-Means Clustering and Naive Bayes Classifier prediction technique is applied to determine the probability of the web site as Valid Phish or Invalid Phish. K-Means Clustering is applied on initial URL features and Validity is checked if still we are not able to determine the Validity of Web Site then Naive Bayes Classifier is applied onto URL as well as HTML tag features of Site and probability is evaluated based on training model.\",\"PeriodicalId\":384966,\"journal\":{\"name\":\"International Conference for Convergence for Technology-2014\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-04-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference for Convergence for Technology-2014\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/I2CT.2014.7092141\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference for Convergence for Technology-2014","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/I2CT.2014.7092141","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

网络钓鱼网站是大多数互联网用户被网络钓鱼者欺骗的主要攻击方式。合法网站的复制品被创建,用户被引导到该网站通过引诱一些报价。W3C(万维网联盟)给出了一定的标准,基于这些标准,我们选择了一些特征,可以很容易地描述合法网站和钓鱼网站之间的区别。我们提出了一个确定网络钓鱼站点的模型,以保护网络用户免受网络钓鱼者的攻击。考虑了URL的特征以及HTML标签中Web Page的特征来判断攻击。这里通过K-Means聚类对数据库进行聚类,并应用朴素贝叶斯分类器预测技术来确定网站为Valid Phish或Invalid Phish的概率。对初始URL特征进行k均值聚类,如果仍然不能确定网站的有效性,则对URL和网站的HTML标签特征进行朴素贝叶斯分类器,并基于训练模型评估概率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A hybrid model to detect phishing-sites using clustering and Bayesian approach
Phishing sites are the major attacks by which most of internet users are being fooled by the phisher. The replicas of the legitimate sites are created and users are directed to that web site by luring some offers to it. There are certain standards which are given by W3C (World Wide Web Consortium), based on these standards we are choosing some features which can easily describe the difference between legit site and phish site. We are proposing a model to determine the phishing sites to safeguard the web users from phisher. The features of URL along with the features of Web Page in HTML tags are considered to determine the attack. Here Clustering of Database is done through K-Means Clustering and Naive Bayes Classifier prediction technique is applied to determine the probability of the web site as Valid Phish or Invalid Phish. K-Means Clustering is applied on initial URL features and Validity is checked if still we are not able to determine the Validity of Web Site then Naive Bayes Classifier is applied onto URL as well as HTML tag features of Site and probability is evaluated based on training model.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信