Topic-aware neural attention network for malicious social media spam detection
Authors: Maged Nasser, Faisal Saeed, Aminu Da’u, Abdulaziz Alblwi, Mohammed Al-Sarem
Journal: Alexandria Engineering Journal, volume 111, pages 540-554
Publication date: 2024-10-29
Publication type: Journal Article
DOI: 10.1016/j.aej.2024.10.073
URL: https://www.sciencedirect.com/science/article/pii/S1110016824012389
Impact factor: 6.2; JCR: Q1 (Engineering, Multidisciplinary)
Citations: 0
Abstract
Social media platforms, such as Facebook and X (formerly known as Twitter), have become indispensable tools in today's society because they facilitate social discussion and information sharing. This feature makes social networks attractive to spammers, who intentionally spread fake messages, rumours, and malicious links. Recently, several machine learning methods have been introduced for malicious spam classification on social networks. However, most existing methods rely on handcrafted features and traditional embedding models, which are relatively less effective. Therefore, inspired by the success of neural attention networks, we propose an interactive neural attention-based method for malicious spam detection that integrates long short-term memory (LSTM), topic modelling, and the BERT technique. In the proposed approach, we first employed an LSTM encoder, which was integrated with the Twitter latent Dirichlet allocation (LDA) model via an interactive attention mechanism to jointly learn local content and global topic representations. Second, to learn the contextualized features of the texts, the model was further integrated with the BERT technique. Finally, the softmax function was applied at the output layer for the final spam classification. A series of experiments was conducted on two real-world datasets to evaluate the model. On dataset 1, the proposed model outperformed the baseline techniques, with average improvements in recall, precision, F1, and accuracy of 17.54 %, 6.19 %, 11.91 %, and 12.27 %, respectively. On the second dataset, the proposed model also performed well, obtaining average gains of 11.81 %, 4.38 %, 8.12 %, and 7.42 % in recall, precision, F1, and accuracy, respectively.
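The abstract does not give the model's equations, so the following is only a minimal pure-Python sketch of how an interactive attention step followed by a softmax output layer might be wired together. It is not the authors' implementation. Everything in it is an assumption made for illustration: dot-product scoring, mean pooling for the attention queries, the toy vectors standing in for LSTM hidden states, topic vectors, and a BERT sentence vector, and the hypothetical weights `W` and `b`. Content and topic vectors are assumed to share one dimension so that a dot product is defined.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean_pool(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def attend(vectors, query):
    """Score each vector against the query, then return the weighted sum."""
    weights = softmax([dot(v, query) for v in vectors])
    pooled = [sum(w * x for w, x in zip(weights, col)) for col in zip(*vectors)]
    return pooled, weights

def classify(hidden_states, topic_vectors, context_vector, W, b):
    # Assumed "interactive" step: content is attended using the pooled
    # topic as query, and topics are attended using the pooled content.
    content_repr, _ = attend(hidden_states, mean_pool(topic_vectors))
    topic_repr, _ = attend(topic_vectors, mean_pool(hidden_states))
    # Concatenate with a stand-in for a contextual (BERT-style) vector.
    features = content_repr + topic_repr + context_vector
    logits = [dot(row, features) + bias for row, bias in zip(W, b)]
    return softmax(logits)  # class probabilities, e.g. [ham, spam]

# Toy inputs (hypothetical values, not taken from the paper).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in LSTM states
topic_vectors = [[0.5, 0.5], [1.0, 0.0]]               # stand-in topic embeddings
context_vector = [0.2, -0.1]                           # stand-in BERT vector
W = [[0.1] * 6, [-0.1] * 6]                            # hypothetical output weights
b = [0.0, 0.0]

probs = classify(hidden_states, topic_vectors, context_vector, W, b)
print(probs)
```

In a trained model the weights would be learned and the stand-in vectors would come from the actual encoders; the sketch only shows how the three representations could be combined before the output layer.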
About the journal:
Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering