Rhetorical Structure Theory-based machine intelligence-driven deceptive phishing attack detection scheme

IF 3.7 | Zone 2 (Computer Science) | JCR Q2 (Computer Science, Information Systems)

Journal of Information Security and Applications | Pub Date: 2025-08-20 | DOI: 10.1016/j.jisa.2025.104184

Chanchal Patra, Debasis Giri, Bibekananda Kundu, Tanmoy Maitra, Mohammad Wazid

Volume 94, Article 104184 | Full text: https://www.sciencedirect.com/science/article/pii/S2214212625002212

Citations: 0

Abstract



The easiest way for users to interact with one another is via emails or messages. However, the growing incidence of cybercrime means that emails and messages must be used with care. Phishing and smishing are among the biggest risks today: attackers use phishing emails to obtain sensitive user data such as credit card information, passwords, and usernames, which can result in severe financial loss. The literature offers a plethora of anti-phishing techniques for identifying phishing emails or messages. However, fraudsters continually devise new techniques, which makes it difficult to develop anti-phishing techniques that stop phishing or smishing attacks. This paper discusses a novel methodology that leverages Rhetorical Structure Theory (RST) to determine whether the text of a given email or message is deceptive. A balanced dataset of deceptive and non-deceptive texts was collected and manually annotated with features such as term Discourse Connectors, Rhetorical Relations, Deception-likely tags, and sentence type. Different machine learning classifiers were trained on these features in experiments aimed at higher accuracy on the deceptive phishing detection task. The proposed technique exhibits high accuracy on the dataset when RST-based linguistic features are used. Classification performance is best when ensemble classifiers are used instead of individual classifiers: in our experiments, the proposed technique achieved higher accuracy, precision, recall, and F1-score than the individual learners.
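The abstract does not say which classifiers or which ensemble method were used, so the following is only a minimal sketch of the general idea: several individual learners vote on each text, and the ensemble is scored with accuracy, precision, recall, and F1. The feature layout, the threshold rules, and every number in `SAMPLES` are invented for illustration and are not taken from the paper.

```python
from collections import Counter

# Hypothetical annotated samples (NOT the paper's data). Each feature tuple is
# (discourse connectors, rhetorical relations, deception-likely tags,
#  imperative sentences); label 1 = deceptive, 0 = non-deceptive.
SAMPLES = [
    ((3, 2, 4, 2), 1),
    ((1, 0, 0, 0), 0),
    ((2, 1, 3, 1), 1),
    ((0, 0, 1, 0), 0),
    ((4, 3, 2, 3), 1),
    ((2, 0, 0, 1), 0),
]


# Three toy "individual learners": each is a threshold rule on one feature.
def by_relations(x):
    return int(x[1] >= 1)


def by_tags(x):
    return int(x[2] >= 2)


def by_imperatives(x):
    return int(x[3] >= 1)


LEARNERS = [by_relations, by_tags, by_imperatives]


def majority_vote(x):
    """Hard-voting ensemble: return the label most learners agree on."""
    votes = Counter(learner(x) for learner in LEARNERS)
    return votes.most_common(1)[0][0]


def scores(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    y_true = [label for _, label in SAMPLES]
    for learner in LEARNERS:
        preds = [learner(x) for x, _ in SAMPLES]
        print(learner.__name__, scores(y_true, preds))
    ensemble_preds = [majority_vote(x) for x, _ in SAMPLES]
    print("majority_vote", scores(y_true, ensemble_preds))
```

On this toy data one learner raises a false positive that the other two outvote, which is the kind of effect an ensemble relies on. It says nothing about the magnitude of the gains reported in the paper.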


Source journal

Journal of Information Security and Applications (Computer Science: Computer Networks and Communications)

CiteScore: 10.90
Self-citation rate: 5.40%
Articles published: 206
Review time: 56 days

About the journal: Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications with relevance to information security and applications. JISA provides a common link between a vibrant scientific and research community and industry professionals by offering a clear view of modern problems and challenges in information security and by identifying promising scientific and "best-practice" solutions. JISA issues balance original research work and innovative industrial approaches by internationally renowned information security experts and researchers.