Pengcheng Xiong, Xin He, Hakan Hacıgümüş, P. Shenoy
Cormorant: Running Analytic Queries on MapReduce with Collaborative Software-Defined Networking
2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), published 2015-11-12
DOI: https://doi.org/10.1109/HotWeb.2015.10
Citations: 3
Abstract
MapReduce is a popular choice for executing analytic workloads over large datasets on clusters of commodity machines. Because such systems are distributed, network bottlenecks can degrade performance, especially when multiple applications share the network. A significant barrier to reducing the occurrence and impact of these bottlenecks is the current separation between MapReduce and network management and routing. The emergence of software-defined networking (SDN) is removing the barriers to cooperation between Hadoop and the network. To explore this opportunity, we focus on how the capabilities of SDN can be used to create a more collaborative relationship between MapReduce and the underlying network. We demonstrate the effectiveness of this collaboration by implementing and experimenting with a system we call Cormorant. Experimental results show up to a 38% improvement in analytic query performance, beyond the benefits achievable by independently optimizing MapReduce schedulers and network flow schedulers.