Using per-Source measurements to improve performance of Internet traffic classification

2010 IEEE Latin-American Conference on Communications. Pub Date: 2010-11-18. DOI: 10.1109/LATINCOM.2010.5641015

S. Bregni, Diego Lucerna, C. Rottondi, G. Verticale


Citations: 5

Abstract

Obfuscated and encrypted protocols hinder traffic classification by classical techniques such as port analysis or deep packet inspection. There is therefore growing interest in classification algorithms based on statistical analysis of the lengths of the first packets of flows. Most classifiers proposed in the literature are based on machine learning techniques and consider each flow independently of previous source activity (per-flow analysis). In this paper, we propose to use specific per-source information to improve classification accuracy: the sequence of starting times of the flows generated by a single source may be analyzed over time to estimate characteristic statistical parameters, in our case the exponent α of the power law $f^{-\alpha}$ that approximates the power spectral density (PSD) of their counting process. In our method, this measurement is used to train a classifier in addition to the lengths of the first packets of the flows. In our experiments, considering this additional per-source information yielded the same accuracy as using only per-flow data while observing fewer packets in each flow, thus allowing a quicker response. For the proposed classifier, we report performance evaluation results obtained on sets of Internet traffic traces collected at three sites.
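The per-source measurement described in the abstract can be illustrated with a minimal sketch. The abstract does not say how α is estimated or which classifier is trained, so everything here is an assumption made for illustration: the estimator is a plain least-squares fit on the log-log periodogram (a simple stand-in, not necessarily the authors' method), and the names `counting_process`, `periodogram`, `estimate_alpha`, and `flow_features`, the bin width, and the zero-padding of short flows are all hypothetical.

```python
import cmath
import math


def counting_process(start_times, bin_width):
    """Number of flow starts in each consecutive bin of `bin_width` seconds."""
    t0 = min(start_times)
    n_bins = int((max(start_times) - t0) // bin_width) + 1
    counts = [0] * n_bins
    for t in start_times:
        counts[int((t - t0) // bin_width)] += 1
    return counts


def periodogram(series, bin_width):
    """Raw periodogram at the positive Fourier frequencies (naive O(N^2) DFT)."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]  # remove the mean before transforming
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        w = -2j * math.pi * k / n
        coeff = sum(v * cmath.exp(w * i) for i, v in enumerate(x))
        freqs.append(k / (n * bin_width))
        power.append(abs(coeff) ** 2)
    return freqs, power


def estimate_alpha(series, bin_width=1.0):
    """Estimate alpha in PSD ~ f^(-alpha) as the negated log-log slope.

    Assumption: ordinary least squares over all positive frequencies.
    """
    freqs, power = periodogram(series, bin_width)
    pts = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if p > 0]
    mx = sum(a for a, _ in pts) / len(pts)
    my = sum(b for _, b in pts) / len(pts)
    slope = (sum((a - mx) * (b - my) for a, b in pts)
             / sum((a - mx) ** 2 for a, _ in pts))
    return -slope


def flow_features(packet_lengths, source_alpha, n_packets):
    """Feature vector: lengths of the first `n_packets` packets plus alpha.

    Assumption: flows shorter than `n_packets` are padded with zeros.
    """
    first = list(packet_lengths[:n_packets])
    first += [0] * (n_packets - len(first))
    return first + [source_alpha]
```

For one source, `estimate_alpha(counting_process(start_times, 1.0))` would give the α value, and `flow_features(lengths, alpha, n_packets)` would give the vector passed to whichever classifier is used. A smaller `n_packets` corresponds to the quicker response mentioned in the abstract.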

