Let Imbalance Have Nowhere to Hide: Class-Sensitive Feature Extraction for Imbalanced Traffic Classification
Authors: Yu Guo, Gaopeng Gou, G. Xiong, Minghao Jiang, Junzheng Shi, Wei Xia
Venue: 2021 International Joint Conference on Neural Networks (IJCNN)
Published: 2021-07-18
DOI: 10.1109/IJCNN52387.2021.9533821 (https://doi.org/10.1109/IJCNN52387.2021.9533821)
As network traffic becomes fully encrypted, traffic classification schemes based on machine learning have proliferated. Class imbalance, although a widely studied challenge in machine learning, has not received enough attention in traffic classification research. The uneven class distribution hidden in real-world traffic degrades the performance of existing schemes. Existing remedies each have a weakness: data pre-sampling tends to introduce noise or discard large amounts of information; the cost matrix required by cost-sensitive methods is difficult to design; and feature selection methods filter out many “redundant” features, which leads to unsatisfactory results. In this paper, we propose DeepFE, an effective end-to-end framework for imbalanced traffic classification that avoids these weaknesses. We adopt deep neural networks for feature extraction and model features from the perspective of channels. DeepFE can learn class-sensitive feature representations, which help distinguish the minority traffic classes. Moreover, DeepFE can be applied to various tasks because its input format is unrestricted, i.e., both raw bytes and packet length sequences can be used. We conducted experiments on the public dataset ISCXVPN2016 and on a real-world traffic dataset covering 27 applications. The results show that DeepFE significantly alleviates the performance degradation caused by imbalance and surpasses several state-of-the-art methods.
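To make the imbalance problem concrete, the following is a minimal sketch (not taken from the paper) of how overall accuracy can hide poor performance on a minority traffic class. The class names, the 95/5 split, and the helper functions `per_class_recall`, `accuracy`, and `macro_recall` are all hypothetical and chosen only for illustration; they are not DeepFE's evaluation protocol.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that appears in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Counter returns 0 for classes that were never predicted correctly.
    return {c: hits[c] / support[c] for c in support}


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy labels: 95 flows of a majority app, 5 of a minority app.
y_true = ["web"] * 95 + ["voip"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["web"] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("per-class recall:", per_class_recall(y_true, y_pred))
print("macro recall:", macro_recall(y_true, y_pred))
```

In this toy setting the always-majority classifier scores high on accuracy while never recognising the minority class, which is why class-averaged metrics are commonly reported alongside accuracy when classes are imbalanced.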