FLAG: Flow Representation Generator based on Self-supervised Learning for Encrypted Traffic Classification

Wenting Wei, Tianjie Ju, Han Liao, Weike Zhao, Huaxi Gu
Published in: 5th Asia-Pacific Workshop on Networking (APNet 2021), 2021-06-24
DOI: 10.1145/3469393.3469394 (https://doi.org/10.1145/3469393.3469394)

Abstract

Because deep learning (DL) learns features well from large-scale raw data, it has attracted considerable attention for encrypted traffic classification. However, most DL-based traffic classifiers rely on very large numbers of labeled samples. Motivated by this, we investigate a self-supervised traffic classifier, FLAG, that needs only a small set of labeled traffic samples plus readily available unlabeled traffic samples, without sacrificing identification accuracy. Specifically, focusing on the local short-term characteristics of traffic, we design a preprocessing algorithm, termed N-phrase Extraction, that converts an unlabeled raw traffic dataset into sequences of high-frequency phrases, which serve as input to a Bidirectional Encoder. Because timing characteristics are significant, the Bidirectional Encoder mines latent timing characteristics from the input sequences and embeds them into robust distributed vector representations, which significantly improves the classifier's performance. Our comprehensive experiments on the UNB ISCX VPN-nonVPN dataset indicate that FLAG achieves a true positive rate of 98.65% using 100% of the dataset and 98.07% using 10% of the dataset, outperforming p-FP, FS-Net, and Deep Packet.
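The abstract does not spell out how phrases are extracted, so the following is only a minimal sketch of one plausible reading: it assumes a "phrase" is a byte n-gram taken from a flow's raw payload, and that "high-frequency" means the top_k most common n-grams across the unlabeled corpus. The names byte_ngrams, build_vocab, encode, n, and top_k are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence

UNKNOWN_ID = 0  # assumption: id 0 stands for any phrase outside the vocabulary


def byte_ngrams(payload: bytes, n: int) -> List[bytes]:
    """Slide a window of n bytes over the payload, one byte at a time."""
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]


def build_vocab(flows: Sequence[bytes], n: int, top_k: int) -> Dict[bytes, int]:
    """Count n-grams over all unlabeled flows and keep the top_k most frequent."""
    counts: Counter = Counter()
    for payload in flows:
        counts.update(byte_ngrams(payload, n))
    # Ids start at 1 so that UNKNOWN_ID stays reserved.
    return {gram: idx + 1 for idx, (gram, _) in enumerate(counts.most_common(top_k))}


def encode(payload: bytes, vocab: Dict[bytes, int], n: int) -> List[int]:
    """Turn one flow into a sequence of phrase ids for a downstream encoder."""
    return [vocab.get(gram, UNKNOWN_ID) for gram in byte_ngrams(payload, n)]


# Toy unlabeled corpus: two short byte strings standing in for flow payloads.
unlabeled_flows = [b"\x16\x03\x01\x16\x03\x01", b"\x16\x03\x03"]
vocab = build_vocab(unlabeled_flows, n=2, top_k=2)
print(encode(b"\x16\x03\x01\x02", vocab, n=2))  # prints [1, 2, 0]
```

The resulting id sequences are the kind of input a bidirectional sequence encoder could consume; how FLAG actually tokenizes traffic and chooses its vocabulary size is described in the full paper, not here.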