Single and Hybrid-Ensemble Learning-Based Phishing Website Detection: Examining Impacts of Varied Nature Datasets and Informative Feature Selection Technique
Kibreab Adane, Berhanu Beyene, Mohammed Abebe
Digital Threats: Research and Practice, published 2023-07-31
DOI: https://doi.org/10.1145/3611392
To address phishing website attacks, the study ran experiments on the RF, GB, and CATB classifiers. Because each classifier is itself an ensemble learner, we integrated them into stacking and majority-vote architectures to create hybrid-ensemble learners. Because ensemble learning methods are known for their high computational time costs, the study applied the UFS technique to reduce that cost and obtained promising results. A phishing website detection system must scale and perform consistently across datasets to combat the many variants of phishing attacks, so we trained and tested each ensemble learning method on three distinct phishing website datasets (DS-1, DS-2, and DS-3) to identify the best performer in terms of accuracy and model computational time. Our experimental findings show that the CATB classifier delivered scalable, consistent, and superior accuracy across the three datasets (97.9% on DS-1, 97.36% on DS-2, and 98.59% on DS-3). In terms of model computational time, the RF classifier was the fastest on all datasets, and the CATB classifier was the second fastest.
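
The majority-vote architecture mentioned in the abstract can be illustrated with a minimal sketch. It assumes hard voting over the class labels predicted by three base classifiers (read here as RF, GB, and CATB); the abstract does not say whether the authors used hard or soft voting. The function name `majority_vote` and the label lists are hypothetical placeholders for illustration, not data or code from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label lists by hard majority vote, one sample at a time.

    `predictions` holds one label sequence per base classifier; all
    sequences must cover the same samples in the same order.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):
        # Pick the label that most base classifiers agreed on for this sample.
        # With three binary voters a tie cannot occur.
        label, _ = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical labels for five URLs (1 = phishing, 0 = legitimate).
rf_pred = [1, 0, 1, 1, 0]
gb_pred = [1, 1, 0, 1, 0]
catb_pred = [0, 0, 1, 1, 0]

ensemble_pred = majority_vote([rf_pred, gb_pred, catb_pred])
print(ensemble_pred)
```

In this toy input, each sample's final label is whichever class at least two of the three base classifiers chose. A stacking architecture differs in that a further model is trained on the base classifiers' outputs instead of counting votes.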