Identification of pharming in communication networks using ensemble learning
N. Azeez, S. Oladele, O. Ologe
Nigerian Journal of Technological Development, published 2022-08-01
DOI: 10.4314/njtd.v19i2.10 (https://doi.org/10.4314/njtd.v19i2.10)
Citation count: 0
Abstract
Pharming scams exploit the DNS as their main weapon, whereas phishing attacks rely on spoofed websites that appear legitimate to internet users. Phishing uses baits such as fake links, but pharming manipulates the DNS server to redirect internet users to a fake, simulated website. Given the damage pharming has caused, including compromised websites, personal email accounts, and social media accounts, use of and reliance on the internet call for caution. Against this backdrop, this work aims to improve pharming detection by adopting machine learning classification algorithms. To obtain the best classification results, an ensemble learning approach was also adopted. The algorithms used are K-Nearest Neighbors (KNN), Decision Tree, Random Forest, Gaussian Naive Bayes, Logistic Regression, Support Vector Machine, Adaptive Boosting (AdaBoost), Gradient Boosting, and Extra Trees Classifier. The classifiers were evaluated on five popular metrics: accuracy, recall, precision, F1 score, and log loss. The results show the performance of each algorithm and how the algorithms compare with one another. The ensemble model comprising Logistic Regression, K-Nearest Neighbors, Decision Tree, Support Vector Machine, Gradient Boosting, AdaBoost, Extra Trees, and Random Forest produced the best results on the two datasets. Among the individual classifiers, Random Forest performed strongly, with mean accuracies of 0.932 and 0.939 on the two datasets, compared with 0.476 and 0.519 for Naive Bayes.
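To make the evaluation described in the abstract concrete, the following is a minimal Python sketch of how predictions from several base classifiers might be combined and then scored with the five metrics named above. The abstract does not say how the ensemble combines its members, so hard majority voting is an assumption here. The label convention (1 = pharming, 0 = legitimate), the function names, and the toy predictions are also hypothetical and are not taken from the paper or its datasets.

```python
import math
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers by hard majority voting.

    `predictions` holds one equal-length list of labels per base classifier.
    Using an odd number of classifiers avoids ties on binary labels.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def log_loss(y_true, p_positive, eps=1e-15):
    """Binary log loss; `p_positive` are predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, p_positive):
        p = min(max(p, eps), 1 - eps)  # clip to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# Toy, made-up data: assumed convention 1 = pharming, 0 = legitimate.
y_true = [1, 0, 1, 1, 0, 0]
clf_a = [1, 0, 1, 0, 0, 1]  # hypothetical output of a first classifier
clf_b = [1, 1, 1, 1, 0, 0]  # hypothetical output of a second classifier
clf_c = [0, 1, 1, 1, 0, 0]  # hypothetical output of a third classifier

ensemble = majority_vote([clf_a, clf_b, clf_c])
metrics = classification_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

Log loss is computed from predicted probabilities rather than hard labels, so in practice it would be applied to each model's probability output, not to the voted labels. Soft voting (averaging probabilities) or stacking would be equally plausible readings of "ensemble learning" in the abstract.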