Leandro Batista de Almeida, D. Magoni, Philip Perry, E. Almeida, John Murphy, Anthony Ventresque
Title: Multi-Layer-Mesh: A Novel Topology and SDN-Based Path Switching for Big Data Cluster Networks
Published in: ICC 2019 - 2019 IEEE International Conference on Communications (ICC)
Publication date: 2019-05-20
DOI: https://doi.org/10.1109/ICC.2019.8761785
Citations: 3
Abstract
Big Data technologies and tools have been used for the past decade to solve a variety of scientific and industrial problems, with Hadoop/YARN becoming the de facto standard for these applications, although other technologies run on top of it. Like any other distributed application, these big data technologies rely heavily on the network infrastructure to read and move data from hundreds or thousands of cluster nodes. Although these technologies are based on reliable and efficient distributed algorithms, some scenarios and conditions can generate bottlenecks and inefficiencies, e.g., when a high number of concurrent users creates data access contention. In this paper, we propose a novel network topology called Multi-Layer-Mesh and an SDN-based path switching algorithm that together can increase the performance of a big data cluster while reducing the amount of network equipment in use, in turn reducing energy consumption and cooling requirements. A thorough simulation-based evaluation of our algorithms shows an average performance improvement of 31.77% and an average decrease in resource utilization of 36.03% compared to a traditional Spine-Leaf topology in the selected test scenarios.