Xiaojuan Wang;Yu Zhang;Mingshu He;Shize Guo;Liu Yang
IEEE Transactions on Sustainable Computing, vol. 9, no. 1, pp. 1-13, published 2023-08-28 (Journal Article)
DOI: 10.1109/TSUSC.2023.3292404
https://ieeexplore.ieee.org/document/10233147/
Supervised Representation Learning for Network Traffic With Cluster Compression
As network traffic grows, network security has gained significant attention. Existing network intrusion detection models often improve their ability to distinguish network behaviors by optimizing the model structure, while ignoring the expressiveness of network traffic at the data level. Visual analysis of network behavior through representation learning can provide a new perspective for network intrusion detection. Unfortunately, representation learning based on machine learning and deep learning often suffers from scalability and interpretability limitations. In this article, we establish an interpretable multi-layer mapping model to enhance the expressiveness of network traffic data. An unsupervised method is applied before the model to extract the internal distribution characteristics of the data and thereby enhance it. We also analyze the feasibility of the proposed flow spectrum theory on the UNSW-NB15 dataset. Experimental results demonstrate that the flow spectrum exhibits significant advantages over the original network traffic features in characterizing network behavior, underscoring its practical application value. Finally, we conduct an application analysis using multiple datasets (CICIDS2017 and CICIDS2018), revealing the model’s strong universality and adaptability across different datasets.