Using per-Source measurements to improve performance of Internet traffic classification

S. Bregni, Diego Lucerna, C. Rottondi, G. Verticale
Published in: 2010 IEEE Latin-American Conference on Communications
Publication date: 2010-11-18
DOI: 10.1109/LATINCOM.2010.5641015 (https://doi.org/10.1109/LATINCOM.2010.5641015)
Citations: 5

Abstract

Obfuscated and encrypted protocols hinder traffic classification by classical techniques such as port analysis or deep packet inspection. Therefore, there is growing interest in classification algorithms based on statistical analysis of the lengths of the first packets of flows. Most classifiers proposed in the literature are based on machine learning techniques and consider each flow independently of previous source activity (per-flow analysis). In this paper, we propose to use specific per-source information to improve classification accuracy: the sequence of starting times of the flows generated by a single source may be analyzed over time to estimate characteristic statistical parameters, in our case the exponent α of the power law f^(-α) that approximates the power spectral density (PSD) of their counting process. In our method, this measurement is used to train a classifier in addition to the lengths of the first packets of the flows. In our experiments, adding this per-source information yielded the same accuracy as using only per-flow data while observing fewer packets in each flow, thus allowing a quicker response. For the proposed classifier, we report performance evaluation results obtained on sets of Internet traffic traces collected at three sites.
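The per-source measurement described in the abstract can be illustrated with a minimal sketch. It bins the flow starting times of one source into a counting process, computes a plain periodogram, and fits a straight line in log-log coordinates to obtain α. The abstract does not state which estimator the authors use, so the periodogram with a least-squares fit is an assumption made here for illustration only. The function names, the bin width, the fraction of low frequencies used in the fit, and the packet lengths in the demo are likewise hypothetical.

```python
import cmath
import math
import random


def counting_process(start_times, bin_width, n_bins):
    """Count how many flows of one source start in each of n_bins time bins."""
    counts = [0] * n_bins
    for t in start_times:
        k = int(t // bin_width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


def periodogram(x):
    """Return (frequencies, power) at the positive DFT frequencies of x.

    A direct O(N^2) DFT keeps the sketch dependency-free; an FFT would
    normally be used for long series.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        s = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                for i, c in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(s) ** 2 / n)
    return freqs, power


def estimate_alpha(counts, max_fraction=0.25):
    """Least-squares fit of log(PSD) = c - alpha * log(f) at low frequencies.

    max_fraction (an assumed tuning knob) selects the lowest share of the
    positive frequencies used in the fit.
    """
    freqs, power = periodogram(counts)
    m = max(2, int(len(freqs) * max_fraction))
    pts = [(math.log(f), math.log(p))
           for f, p in zip(freqs[:m], power[:m]) if p > 0]
    if len(pts) < 2:
        raise ValueError("not enough non-zero spectral points to fit a slope")
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return -sxy / sxx  # slope is -alpha, so negate it


def feature_vector(first_packet_lengths, alpha):
    """Per-flow packet lengths extended with the per-source alpha."""
    return list(first_packet_lengths) + [alpha]


if __name__ == "__main__":
    random.seed(0)  # synthetic Poisson arrivals, not data from the paper
    t, starts = 0.0, []
    while t < 256.0:
        t += random.expovariate(2.0)  # about 2 flow starts per second
        starts.append(t)
    counts = counting_process(starts, bin_width=1.0, n_bins=256)
    alpha = estimate_alpha(counts)
    # Hypothetical lengths of the first three packets of one flow.
    print(feature_vector([74, 1500, 66], alpha))
```

The resulting vector could be passed to any standard classifier in place of the per-flow vector alone. A raw periodogram is a noisy spectral estimate, so a real deployment would likely need a more robust estimator of α than this sketch provides.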