Let Imbalance Have Nowhere to Hide: Class-Sensitive Feature Extraction for Imbalanced Traffic Classification

2021 International Joint Conference on Neural Networks (IJCNN) Pub Date: 2021-07-18 DOI: 10.1109/IJCNN52387.2021.9533821

Yu Guo, Gaopeng Gou, G. Xiong, Minghao Jiang, Junzheng Shi, Wei Xia


Citations: 1

Abstract

As network traffic becomes fully encrypted, traffic classification schemes based on machine learning have proliferated. Class imbalance, although widely studied in machine learning, has received too little attention in traffic classification research. The uneven class distribution hidden in real-world traffic degrades the performance of existing schemes. Existing remedies each have drawbacks: data pre-sampling tends to introduce noise or discard large amounts of information; the cost matrix required by cost-sensitive methods is difficult to design; and feature selection methods filter out many "redundant" features, leading to unsatisfactory results. In this paper, we propose DeepFE, an effective end-to-end framework for imbalanced traffic classification that avoids these weaknesses. We adopt deep neural networks for feature extraction and model features from the perspective of channels. DeepFE learns a class-sensitive feature representation, which helps distinguish minority traffic classes. Moreover, DeepFE can be applied to various tasks because its input format is unrestricted: both raw bytes and packet length sequences can be used. We conducted experiments on the public dataset ISCXVPN2016 and on a real-world traffic dataset covering 27 applications. The results show that DeepFE significantly alleviates the performance degradation caused by imbalance and surpasses several state-of-the-art methods.
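To make the imbalance problem concrete, the following minimal sketch shows how overall accuracy can hide a complete failure on a minority class. It is an illustration only and is not taken from the paper: the class names `web` and `voip`, the 95/5 split, and the always-majority classifier are hypothetical, and the abstract does not state which metrics the authors report.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of flows whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that appears in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so each class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy data: 95 flows of a majority class, 5 of a minority class.
y_true = ["web"] * 95 + ["voip"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["web"] * 100

print(accuracy(y_true, y_pred))          # 0.95
print(per_class_recall(y_true, y_pred))  # {'web': 1.0, 'voip': 0.0}
print(macro_recall(y_true, y_pred))      # 0.5
```

Accuracy looks high at 0.95 even though no minority flow is classified correctly, whereas the macro-averaged recall of 0.5 exposes the failure. This is why class-averaged measures are commonly preferred when evaluating classifiers on imbalanced traffic.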
