Exploring Textures in Traffic Matrices to Classify Data Center Communications

2018 IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA) | Pub Date: 2018-08-09 | DOI: 10.1109/AINA.2018.00161

Celio Trois, L. C. E. Bona, Luiz Oliveira, M. Martinello, D. Harewood-Gill, Marcos Didonet Del Fabro, R. Nejabati, D. Simeonidou, J. C. D. Lima, B. Stein


Citations: 4

Abstract

Data analytics and scientific computing are two classes of modern applications whose computation and communication needs have changed substantially in recent years, requiring additional processing capability and bandwidth to keep pace with current demands. These applications commonly run within data centers, where they exchange enormous volumes of data and rapidly stress existing network infrastructure. It is therefore crucial for data center operations and management to understand and classify the communication demands of these applications. The traditional approaches to classifying application traffic, port-based classification and Deep Packet Inspection, both have problems with current network technology. Some recent works propose classifying traffic with machine learning applied to statistical information collected from application flows. Applications running in data centers exhibit communication patterns that can be recognized in their traffic matrices. The main contribution of this paper is a method that uses textural information extracted from these matrices to classify data center traffic with machine learning techniques. As a proof of concept, we implemented this method in a system named DCTraCS. The experimental dataset was gathered from two real data centers by collecting the traffic matrices of MapReduce and a set of scientific applications every second for a period of 30 minutes. To assess our proposal, we compared it with other machine learning techniques for classifying application traffic found in the current literature. Results show that our approach achieved the highest accuracy, correctly classifying over 99% of our data center applications.
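
The idea of treating a traffic matrix as an image and describing its texture can be illustrated with a short sketch. The abstract does not say which texture descriptor or classifier DCTraCS uses, so everything below is an assumption made for illustration: the local binary pattern (LBP) histogram stands in for "textural information", and the function names `to_grayscale` and `lbp_histogram` are hypothetical.

```python
import numpy as np


def to_grayscale(traffic: np.ndarray) -> np.ndarray:
    """Scale a non-negative traffic matrix (e.g. bytes per host pair) to 0..255."""
    traffic = np.asarray(traffic, dtype=np.float64)
    peak = traffic.max()
    if peak == 0:
        # No traffic at all: return a black image instead of dividing by zero.
        return np.zeros(traffic.shape, dtype=np.uint8)
    return np.round(255.0 * traffic / peak).astype(np.uint8)


def lbp_histogram(image: np.ndarray) -> np.ndarray:
    """Normalized 256-bin histogram of 8-neighbour local binary patterns.

    Assumption: LBP is only one possible texture descriptor; the paper's
    actual choice is not stated in the abstract.
    """
    rows, cols = image.shape
    # Replicate border pixels so every cell has eight neighbours.
    padded = np.pad(image, 1, mode="edge").astype(np.int32)
    center = padded[1:-1, 1:-1]
    codes = np.zeros((rows, cols), dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        neighbour = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


# Two synthetic communication patterns over 16 hosts (illustrative only).
n = 16
ring = np.zeros((n, n))
ring[np.arange(n), (np.arange(n) + 1) % n] = 1.0   # each host sends to its successor
all_to_all = np.ones((n, n))                        # every pair exchanges equal traffic

ring_features = lbp_histogram(to_grayscale(ring))
all_to_all_features = lbp_histogram(to_grayscale(all_to_all))
```

Each matrix sampled over time would yield one such 256-element feature vector, and those vectors could then be passed to any standard supervised classifier. The two synthetic patterns above produce different histograms, which is the property a classifier would rely on.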

