Machine Learning-Driven Detection of Cross-Site Scripting Attacks

Information. Pub Date: 2024-07-20. DOI: 10.3390/info15070420

Rahmah Alhamyani, Majid Alshammari

Abstract

The ever-growing web application landscape, fueled by technological advancements, introduces new vulnerabilities to cyberattacks. Cross-site scripting (XSS) attacks pose a significant threat, exploiting the difficulty of distinguishing between benign and malicious scripts within web applications. Traditional detection methods struggle with high false-positive (FP) and false-negative (FN) rates. This research proposes a novel machine learning (ML)-based approach for robust XSS attack detection. We evaluate various models including Random Forest (RF), Logistic Regression (LR), Support Vector Machines (SVMs), Decision Trees (DTs), Extreme Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), Convolutional Neural Networks (CNNs), Artificial Neural Networks (ANNs), and ensemble learning. The models are trained on a real-world dataset categorized into benign and malicious traffic, incorporating feature selection methods like Information Gain (IG) and Analysis of Variance (ANOVA) for optimal performance. Our findings reveal exceptional accuracy, with the RF model achieving 99.78% and ensemble models exceeding 99.64%. These results surpass existing methods, demonstrating the effectiveness of the proposed approach in securing web applications while minimizing FPs and FNs. This research offers a significant contribution to the field of web application security by providing a highly accurate and robust ML-based solution for XSS attack detection.
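
The abstract names Information Gain (IG) as one of the feature selection methods. As a rough illustration of what IG measures, the sketch below ranks a few binary features by how much they reduce label entropy, IG = H(Y) - H(Y | X). Everything in it is an assumption made for illustration: the payloads are hand-written toy strings (not the paper's dataset), and the feature names and the `rank_features` helper are hypothetical, not the authors' pipeline.

```python
import math
from collections import Counter

# Toy, hand-written samples (NOT the paper's dataset): (payload, label), 1 = malicious.
SAMPLES = [
    ("<script>alert(1)</script>", 1),
    ("<img src=x onerror=alert(1)>", 1),
    ("<svg onload=alert(document.cookie)>", 1),
    ("javascript:alert(1)", 1),
    ("<p>Hello, world</p>", 0),
    ("<a href='/home'>Home</a>", 0),
    ("<script src='/static/app.js'></script>", 0),
    ("search?q=shoes&page=2", 0),
]

# Hypothetical binary features, chosen only to illustrate the computation.
FEATURES = {
    "has_script_tag": lambda s: "<script" in s.lower(),
    "has_event_handler": lambda s: "onerror=" in s.lower() or "onload=" in s.lower(),
    "has_alert_call": lambda s: "alert(" in s.lower(),
    "has_angle_bracket": lambda s: "<" in s,
}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """IG = H(Y) - H(Y | X) for one discrete feature X."""
    total = len(labels)
    conditional = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def rank_features(samples, features):
    """Return (feature name, IG) pairs sorted from most to least informative."""
    labels = [y for _, y in samples]
    scores = {
        name: information_gain([fn(text) for text, _ in samples], labels)
        for name, fn in features.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_features(SAMPLES, FEATURES):
        print(f"{name}: {score:.3f}")
```

On this toy data the `alert(` feature separates the two classes perfectly, while the `<script` feature carries no information because a benign sample also contains a script tag. That mirrors the difficulty the abstract describes in telling benign scripts from malicious ones. A real pipeline would compute such scores over the full feature set and keep only the top-ranked features before training a classifier.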
