Robust Detection of Malicious Encrypted Traffic via Contrastive Learning

IF 6.3 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, THEORY & METHODS

IEEE Transactions on Information Forensics and Security | Pub Date: 2025-04-15 | DOI: 10.1109/TIFS.2025.3560560

Meng Shen; Jinhe Wu; Ke Ye; Ke Xu; Gang Xiong; Liehuang Zhu

Volume 20, pages 4228-4242 | Full text: https://ieeexplore.ieee.org/document/10964328/

Citations: 0

Abstract

Traffic encryption is widely used to protect communication privacy but is increasingly exploited by attackers to conceal malicious activities. Existing malicious encrypted traffic detection methods rely on large amounts of labeled samples for training, which limits their ability to respond quickly to new attacks. These methods are also vulnerable to traffic obfuscation strategies, such as injecting dummy packets. In this paper, we propose SmartDetector, a robust malicious encrypted traffic detection method based on contrastive learning. We first propose a novel traffic representation named Semantic Attribute Matrix (SAM), which can effectively distinguish between malicious and benign traffic. We also design a data augmentation method to generate diverse traffic samples, which makes the detection model more robust against different traffic obfuscation strategies. We then propose a malicious encrypted traffic classifier that first pre-trains a model via contrastive learning to learn deep representations from unlabeled data, then fine-tunes the model with a supervised classifier to achieve accurate detection even with only a few labeled samples. We conduct extensive experiments on five public datasets to evaluate the performance of SmartDetector. The results demonstrate that it outperforms the state-of-the-art (SOTA) methods in three typical scenarios. Specifically, in the evasion attack detection scenario, SmartDetector achieves an F1 score and AUC above 93%, with average improvements of 19.84% and 18.17% over the SOTA method, respectively.
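The abstract does not give the augmentation rules or the contrastive objective, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a flow is a list of signed packet lengths (sign as direction), that augmentation inserts random dummy packets, and that pre-training uses the NT-Xent loss from SimCLR as a stand-in for whatever loss SmartDetector actually uses. The names `inject_dummy_packets` and `nt_xent` are hypothetical, and the embeddings passed to `nt_xent` would come from an encoder that is not shown.

```python
import math
import random


def inject_dummy_packets(lengths, rate=0.2, max_len=1500, rng=None):
    """Return a copy of a signed packet-length sequence with dummy packets added.

    Assumption for this sketch: sign encodes direction (+ outbound, - inbound),
    and a dummy packet may be inserted before each real packet with
    probability `rate`.
    """
    rng = rng or random.Random()
    out = []
    for length in lengths:
        if rng.random() < rate:
            out.append(rng.choice((-1, 1)) * rng.randint(1, max_len))
        out.append(length)
    return out


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss over N positive pairs (z1[i], z2[i]).

    z1[i] and z2[i] are embeddings of two augmented views of the same flow;
    every other embedding in the batch acts as a negative.
    """
    z = z1 + z2
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same flow
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


if __name__ == "__main__":
    flow = [517, -1460, -1460, 126, -51]
    view_a = inject_dummy_packets(flow, rng=random.Random(1))
    view_b = inject_dummy_packets(flow, rng=random.Random(2))
    print(view_a, view_b)
```

In this sketch, two independently augmented views of the same flow form a positive pair, so an encoder trained to minimize the loss is pushed to ignore injected packets. A small labeled set would then be used to fine-tune a classifier on top of the pre-trained encoder, as the abstract describes.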


Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 7.40%

Articles published: 234

Review time: 6.5 months

Journal description: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.