Xinge Yan , Liukun He , Yifan Xu , Jiuxin Cao , Liangmin Wang , Guyang Xie
Title: High-speed encrypted traffic classification by using payload features
Journal: Digital Communications and Networks, vol. 11, no. 2, pp. 412-423 (JCR Q1, Telecommunications; impact factor 7.5)
Publication date: 2025-04-01
DOI: 10.1016/j.dcan.2024.02.003
URL: https://www.sciencedirect.com/science/article/pii/S2352864824000208
Citations: 0
Abstract
Traffic encryption makes it easier for cyberattackers to hide their presence and activities, and traffic classification is an important method for countering network threats. However, because of the enormous traffic volume and limited computing resources, most existing traffic classification techniques are inapplicable to high-speed network environments. In this paper, we propose a two-stage High-speed Encrypted Traffic Classification (HETC) method. First, to efficiently detect whether traffic is encrypted, HETC focuses on randomly sampled short flows and extracts aggregation entropies with chi-square test features, which measure how the byte composition and distribution differ between encrypted and unencrypted flows. Second, HETC adds binary features to the previous features and performs fine-grained traffic classification by combining these payload features with a Random Forest model. The experimental results show that HETC achieves a 94% F-measure in detecting encrypted flows and an 85%–93% F-measure in classifying fine-grained flows on a dataset with a 1-KB flow length, outperforming the state-of-the-art comparison methods. Moreover, HETC does not need to wait for the end of the flow and can extract mass computing features. The average time for HETC to process each flow is only 2 or 16 ms, which is lower than the flow duration in most cases, making it a good candidate for high-speed traffic classification.
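To make the first stage more concrete, the following is a minimal sketch of two payload statistics of the kind the abstract mentions: a byte-level entropy and a chi-square statistic against a uniform byte distribution. It is not the authors' implementation. The function names, the plain Shannon entropy (standing in for the paper's "aggregation entropies", whose exact definition is not given here), and the 1024-byte cutoff (chosen to mirror the 1-KB flow length) are all assumptions made for illustration.

```python
import math
from collections import Counter


def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def chi_square_uniform(payload: bytes) -> float:
    """Pearson chi-square statistic against a uniform distribution over 256 byte values."""
    if not payload:
        return 0.0
    expected = len(payload) / 256
    counts = Counter(payload)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def payload_features(payload: bytes, limit: int = 1024) -> tuple:
    """Hypothetical helper: compute both statistics on the first `limit` bytes only."""
    head = payload[:limit]
    return byte_entropy(head), chi_square_uniform(head)


if __name__ == "__main__":
    # Encrypted-looking payloads tend toward high entropy and a low chi-square
    # value; structured plaintext tends toward the opposite.
    uniform_like = bytes(range(256)) * 4
    plaintext = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n" * 30
    print(payload_features(uniform_like))
    print(payload_features(plaintext))
```

Because only the first bytes of a flow are inspected, features like these can be computed without waiting for the flow to end, which matches the property the abstract highlights. In a full pipeline, such values would be fed to a classifier such as a Random Forest together with the other features.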
About the journal:
Digital Communications and Networks is a prestigious journal that focuses on communication systems and networks. We publish only top-notch original articles and authoritative reviews, all of which undergo rigorous peer review. We are proud to announce that all our articles are fully Open Access and can be accessed on ScienceDirect. Our journal is recognized and indexed by eminent databases such as the Science Citation Index Expanded (SCIE) and Scopus.
In addition to regular articles, we may also consider exceptional conference papers that have been significantly expanded. Furthermore, we periodically release special issues that focus on specific aspects of the field.
In conclusion, Digital Communications and Networks is a leading journal that guarantees exceptional quality and accessibility for researchers and scholars in the field of communication systems and networks.