Md. Faiyed Bin Karim, Tasnimul Hasan, Nushera Tazreen, Safayat Bin Hakim, Samiha Tarannum
2022 IEEE International Conference on Cybernetics and Computational Intelligence (CyberneticsCom), published 2022-06-16. DOI: 10.1109/CyberneticsCom55287.2022.9865297 (https://doi.org/10.1109/CyberneticsCom55287.2022.9865297)
An investigation of ML techniques to detect Phishing Websites by complexity reduction
In today's digital age, one of the predominant causes of security breaches is phishing websites that disguise themselves as legitimate websites and trick unsuspecting users into revealing sensitive information. With the proliferation of high-speed internet and the popularization of IT education, there has been an increase in unscrupulous actors on the web who are ready to counterfeit a legitimate website and use it to deceive and manipulate users. Both software-based and non-software-based techniques have been used to try to unmask phishers. Phishing websites exhibit many characteristics, so classifying and detecting them is unavoidably time-consuming and complex. Our research analyzed several hybrid machine learning models that combine a bespoke preprocessing step, which removes minimally correlated features, with four boosting algorithms and three SVM models for classification. The models were also retrained after hyperparameter tuning. Among the investigated models, XGBoost achieved the highest accuracy, 97.0455%, after hyperparameter tuning.
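The abstract does not spell out how the minimally correlated features are identified, so the following is only a minimal sketch of one plausible reading: compute the Pearson correlation between each feature and the class label, and drop features whose absolute correlation falls below a cutoff. The feature names, the toy data, the `select_features` helper, and the `threshold` value of 0.1 are all assumptions made for illustration and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, names, threshold=0.1):
    """Keep the names of features whose |correlation with the label| >= threshold.

    `threshold` is a hypothetical cutoff; the paper's actual criterion is not
    stated in the abstract.
    """
    kept = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        if abs(pearson(column, labels)) >= threshold:
            kept.append(name)
    return kept


# Toy data with hypothetical feature names (1 = suspicious, 0 = not).
names = ["has_ip_address", "url_length_flag", "favicon_flag"]
rows = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
]
labels = [1, 1, 0, 0]  # 1 = phishing, 0 = legitimate

kept = select_features(rows, labels, names, threshold=0.1)
print(kept)
```

In this toy data only the first feature varies with the label, so the other two are filtered out. In a pipeline like the one described, the reduced feature set would then be passed to the boosting and SVM classifiers, which shrinks the input each model has to process.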