Web Attack Payload Identification and Interpretability Analysis Based on Graph Convolutional Network

2022 18th International Conference on Mobility, Sensing and Networking (MSN) | Pub Date: 2022-12-01 | DOI: 10.1109/MSN57253.2022.00071

Yijia Xu, Yong Fang, Zhonglin Liu


Citations: 0

Abstract


Web Attack Payload Identification and Interpretability Analysis Based on Graph Convolutional Network

Web attack payload identification is an important part of a Web defense system. Current approaches usually combine natural language processing and deep learning to build a detection model automatically and intercept malicious payloads. However, these detection methods ignore the bidirectional association between fields and are prone to the payload dilution problem for long strings. In addition, the weak interpretability of deep learning models makes it difficult for researchers to address model pollution and to adjust the model according to its prediction logic. This paper therefore proposes a new Web attack payload identification method based on the Graph Convolutional Network (GCN), which can effectively extract Web payload features and supports model interpretability analysis. The core of the method is to transform the text feature problem into a graph feature extraction problem and to understand the structure and content of the Web payload from a graph perspective. The method performs node embedding on the Web payload graph through a GCN, then converts the embedding vectors into a graph feature vector through a feature fusion method. A node ablation method is used to analyze the interpretability of malicious payloads and to calculate the predicted impact rate of nodes inside the graph structure. Experiments on the CSIC 2010 v2 HTTP dataset show that the proposed method identifies Web attack payloads with high accuracy, and that node embedding with the Relational Graph Convolutional Network (RGCN) is more suitable for identifying Web attack payloads than other GCN methods. The results also show that model interpretability analysis based on the Web payload graph is reasonable and can effectively assist researchers in adjusting the model and preventing model pollution.
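The pipeline described in the abstract (node embedding with a GCN, fusion into a graph feature vector, then node ablation) can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract gives no architecture details, so the toy four-node graph, the feature sizes, the random untrained weights, the mean-pooling readout standing in for "feature fusion", and the definition of a node's impact as the drop in predicted probability when that node is removed are all assumptions made here for illustration. It also uses the plain GCN propagation rule rather than the RGCN variant the abstract reports as most suitable.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by the standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

def graph_score(adj, x, w1, w2, w_out, keep=None):
    """Return a hypothetical 'malicious' probability for a payload graph.

    keep is an optional boolean mask; nodes marked False are ablated.
    """
    if keep is not None:
        adj = adj[np.ix_(keep, keep)]           # drop rows and columns of ablated nodes
        x = x[keep]
    a_norm = normalize_adjacency(adj)
    h = gcn_layer(a_norm, gcn_layer(a_norm, x, w1), w2)   # node embeddings
    g = h.mean(axis=0)                          # assumed fusion: mean readout
    return 1.0 / (1.0 + np.exp(-(g @ w_out)))   # sigmoid over a linear head

def node_impact(adj, x, params):
    """Ablate each node in turn and record the change in the prediction."""
    n = adj.shape[0]
    base = graph_score(adj, x, *params)
    impacts = np.empty(n)
    for i in range(n):
        keep = np.ones(n, dtype=bool)
        keep[i] = False
        impacts[i] = base - graph_score(adj, x, *params, keep=keep)
    return base, impacts

# Toy payload graph: 4 token nodes connected in a chain (hypothetical data).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                     # 3 features per node
params = (rng.normal(size=(3, 8)),              # untrained layer-1 weights
          rng.normal(size=(8, 8)),              # untrained layer-2 weights
          rng.normal(size=8))                   # untrained output head
base, impacts = node_impact(adj, x, params)
print("base probability:", base)
print("per-node impact:", impacts)
```

In a trained model, nodes with a large positive impact would be the ones whose removal most reduces the malicious score, which is the kind of signal a node ablation analysis looks for. With the random weights used here the numbers carry no meaning beyond showing the mechanics.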
