Identification of pharming in communication networks using ensemble learning

Q4 Engineering
N. Azeez, S. Oladele, O. Ologe
Published 2022-08-01 in the Nigerian Journal of Technological Development. DOI: 10.4314/njtd.v19i2.10 (https://doi.org/10.4314/njtd.v19i2.10)
Citations: 0

Abstract

Pharming scams exploit the Domain Name System (DNS) as their main weapon, whereas phishing attacks rely on spoofed websites that appear legitimate to internet users. Phishing uses baits such as fake links, but pharming manipulates the DNS server to redirect internet users to a fake, simulated website. Because pharming has exposed websites, personal email, and social media accounts to compromise, use of and reliance on the internet call for caution. Against this backdrop, this work aims to enhance pharming detection strategies by adopting machine learning classification algorithms. To obtain the best classification results, an ensemble learning approach was also adopted. The algorithms used include K-Nearest Neighbors (KNN), Decision Tree, Random Forest, Gaussian Naive Bayes, Logistic Regression, Support Vector Machine, Adaptive Boosting (AdaBoost), Gradient Boosting, and Extra Trees Classifier. The classifiers were evaluated against five popular metrics: accuracy, recall, precision, F1 score, and log loss. The results demonstrate the performance of all algorithms used, as well as their relationships. The ensemble model that included Logistic Regression, K-Nearest Neighbors, Decision Tree, Support Vector Machine, Gradient Boosting, AdaBoost, Extra Trees, and Random Forest produced the best results on the two datasets. Random Forest also clearly outperformed Naive Bayes, with mean accuracies of 0.932 and 0.939 on the two datasets, respectively, compared with 0.476 and 0.519 for Naive Bayes.
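The five metrics named in the abstract, and one common way of combining classifiers into an ensemble, can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not say how the ensemble combined its member models, so soft voting (averaging predicted probabilities) is assumed here; the helper names (`confusion_counts`, `evaluate`, `soft_vote`), the 0.5 threshold, and the toy labels and probabilities are all hypothetical and are not the paper's data or results.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = pharming."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_prob, threshold=0.5):
    """Compute accuracy, precision, recall, F1 and log loss from P(label = 1)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    eps = 1e-15  # clip probabilities so log() never receives 0
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "log_loss": log_loss,
    }


def soft_vote(prob_lists):
    """Soft voting (an assumption): average each sample's probability across models."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


# Made-up toy labels and per-model probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_probs = {
    "model_a": [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1],
    "model_b": [0.6, 0.4, 0.7, 0.3, 0.2, 0.3, 0.9, 0.4],
    "model_c": [0.8, 0.1, 0.6, 0.6, 0.7, 0.2, 0.4, 0.3],
}

results = {name: evaluate(y_true, probs) for name, probs in model_probs.items()}
results["ensemble"] = evaluate(y_true, soft_vote(list(model_probs.values())))

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

In this contrived example each individual model misclassifies at least one sample, while the averaged probabilities classify all eight correctly. That outcome is a property of the invented numbers, not evidence about the paper's datasets.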
Source journal
Nigerian Journal of Technological Development (Engineering: Engineering (miscellaneous))
CiteScore: 1.00
Self-citation rate: 0.00%
Articles published: 40
Review time: 24 weeks