Framework for detecting phishing crimes on Twitter using selective features and machine learning

IF 4 | Zone 3 (Computer Science) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computers & Electrical Engineering Pub Date : 2025-04-28 DOI:10.1016/j.compeleceng.2025.110363

Hina Rashid , Hannan Bin Liaqat , Muhammad Usman Sana , Tayybah Kiren , Hanen Karamti , Imran Ashraf

Computers & Electrical Engineering, Volume 124, Article 110363. https://www.sciencedirect.com/science/article/pii/S0045790625003064

Citations: 0

Abstract

Socially aware information technology (SIT) plays a prominent role in helping users carry out a variety of tasks. Social media phishing is an escalating cybersecurity threat in which attackers use deceptive tricks to steal personal data. Real-time phishing detection is crucial and depends heavily on selecting the most relevant features. Existing literature often relies on manual or random feature selection, which leads to inefficient classification. This research introduces a hybrid machine learning approach to phishing detection that uses three feature selection methods (Relief, Chi-square, and the extra tree classifier) to determine the most important features. Five classifiers, namely Naïve Bayes, support vector machine, decision tree, random forest (RF), and logistic regression, are assessed on accuracy, precision, recall, F1 score, and area under the curve (AUC). Experimental results indicate that RF obtains the highest accuracy of 95.56% and an AUC of 99.00%, outperforming the other models and previous works. The results demonstrate the effectiveness of the proposed method in improving phishing detection on social media.
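
To make the abstract's terms concrete, the following is a minimal sketch in plain Python of one of the named feature selection scores (Chi-square) and of the five evaluation measures. It is not the authors' implementation: the function names, the toy feature names (`has_short_url`, `has_emoji`), and the data are hypothetical, the features are assumed to be binary, and the AUC is computed with the pairwise-ranking definition rather than any particular library routine.

```python
def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one binary feature vs. binary labels."""
    n = len(labels)
    # Observed counts of the 2x2 contingency table (feature value, label).
    counts = {(f, y): 0 for f in (0, 1) for y in (0, 1)}
    for f, y in zip(feature, labels):
        counts[(f, y)] += 1
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row_total = counts[(f, 0)] + counts[(f, 1)]
            col_total = counts[(0, y)] + counts[(1, y)]
            expected = row_total * col_total / n
            if expected > 0:
                score += (counts[(f, y)] - expected) ** 2 / expected
    return score


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels where 1 = phishing."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only (not from the paper).
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    features = {
        "has_short_url": [1, 1, 1, 0, 0, 0, 0, 1],
        "has_emoji": [1, 0, 1, 0, 1, 0, 1, 0],
    }
    # Rank features by chi-square score, highest first.
    ranked = sorted(features,
                    key=lambda name: chi_square_score(features[name], labels),
                    reverse=True)
    print("feature ranking:", ranked)

    predictions = [1, 1, 1, 0, 0, 0, 0, 1]
    probabilities = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]
    print(binary_metrics(labels, predictions))
    print("AUC:", auc(labels, probabilities))
```

A feature that is independent of the label scores zero under this statistic, which is why a selection step would drop it before the classifiers are trained. In practice, a library implementation such as scikit-learn's would normally replace these hand-written functions.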

Source journal

Computers & Electrical Engineering (Engineering Technology: Electrical & Electronic Engineering)

CiteScore

9.20

Self-citation rate

7.00%

Articles published

661

Review time

47 days

About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.