Web Attack Payload Identification and Interpretability Analysis Based on Graph Convolutional Network
Authors: Yijia Xu, Yong Fang, Zhonglin Liu
Published in: 2022 18th International Conference on Mobility, Sensing and Networking (MSN), 2022-12-01
DOI: 10.1109/MSN57253.2022.00071 (https://doi.org/10.1109/MSN57253.2022.00071)
Web attack payload identification is a key part of Web defense systems. Current approaches usually combine natural language processing and deep learning to automatically build detection models that intercept malicious payloads. However, these detection methods ignore the bidirectional association between fields and are prone to the payload dilution problem for long strings. In addition, the weak interpretability of deep learning models makes it difficult for researchers to address model pollution and to adjust the model according to its prediction logic. This paper therefore proposes a new Web attack payload identification method based on the Graph Convolutional Network (GCN), which extracts Web payload features effectively and supports model interpretability analysis. The core of the method is to transform the text feature extraction problem into a graph feature extraction problem and to understand the structure and content of the Web payload from a graph perspective. The method performs node embedding on the Web payload graph with a GCN, then converts the node embedding vectors into a graph feature vector through a feature fusion method. A node ablation method is used to analyze the interpretability of malicious payload predictions and to calculate the predicted impact rate of each node in the graph structure. Experiments on the CSIC 2010 v2 HTTP dataset show that the proposed method identifies Web attack payloads with high accuracy, and that node embedding with the Relational Graph Convolutional Network (RGCN) is better suited to this task than other GCN methods. The results also show that interpretability analysis based on the Web payload graph is reasonable and can effectively help researchers adjust the model and prevent model pollution.
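The pipeline described above (payload graph, node embedding, feature fusion, node ablation) can be illustrated with a small sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the graph links adjacent tokens, a single plain GCN layer stands in for RGCN, mean pooling stands in for the feature fusion step, the node features and weights are hand-set rather than trained, and the "impact rate" is taken to be the relative drop in the malicious score when one node is removed.

```python
import math

def normalized_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph as nested lists."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, h), w)]

def graph_score(features, edges, w, readout_w):
    """Embed nodes, mean-pool them into one graph vector, map it to (0, 1)."""
    n = len(features)
    h = gcn_layer(normalized_adjacency(n, edges), features, w)
    pooled = [sum(col) / n for col in zip(*h)]  # assumed fusion: mean pooling
    z = sum(p * r for p, r in zip(pooled, readout_w))
    return 1.0 / (1.0 + math.exp(-z))

def ablate(features, edges, k):
    """Drop node k and its edges, re-indexing the remaining nodes."""
    kept_ids = [i for i in range(len(features)) if i != k]
    remap = {old: new for new, old in enumerate(kept_ids)}
    kept_features = [features[i] for i in kept_ids]
    kept_edges = [(remap[u], remap[v]) for u, v in edges if u != k and v != k]
    return kept_features, kept_edges

def impact_rates(features, edges, w, readout_w):
    """Assumed impact rate: relative score drop after removing each node."""
    base = graph_score(features, edges, w, readout_w)
    rates = []
    for k in range(len(features)):
        f, e = ablate(features, edges, k)
        rates.append((base - graph_score(f, e, w, readout_w)) / base)
    return rates

# Hypothetical payload "id=1 OR 1=1" split into four tokens (graph needs >= 2 nodes).
tokens = ["id", "1", "OR", "1=1"]
# Hand-set two-dimensional features: [looks benign, looks suspicious].
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
edges = [(0, 1), (1, 2), (2, 3)]      # assumed: adjacent tokens are linked
w = [[1.0, 0.0], [0.0, 1.0]]          # untrained identity weights
readout_w = [-1.0, 2.0]               # untrained readout weights

rates = impact_rates(features, edges, w, readout_w)
for token, rate in zip(tokens, rates):
    print(f"{token!r}: impact rate {rate:+.3f}")
```

With these toy weights, removing a suspicious token lowers the malicious score (positive impact rate), while removing a benign token raises it (negative impact rate). A trained model would replace the hand-set weights, and a relational variant would keep one weight matrix per edge type.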