Kylix:商品簇的稀疏Allreduce

2014 43rd International Conference on Parallel Processing Pub Date : 2014-10-18 DOI:10.1109/ICPP.2014.36

Huasha Zhao, J. Canny

{"title":"Kylix:商品簇的稀疏Allreduce","authors":"Huasha Zhao, J. Canny","doi":"10.1109/ICPP.2014.36","DOIUrl":null,"url":null,"abstract":"Allreduce is a basic building block for parallel computing. Our target here is \"Big Data\" processing on commodity clusters (mostly sparse power-law data). Allreduce can be used to synchronize models, to maintain distributed datasets, and to perform operations on distributed data such as sparse matrix multiply. We first review a key constraint on cluster communication, the minimum efficient packet size, which hampers the use of direct all-to-all protocols on large networks. Our allreduce network is a nested, heterogeneous-degree butterfly. We show that communication volume in lower layers is typically much less than the top layer, and total communication across all layers a small constant larger than the top layer, which is close to optimal. A chart of network communication volume across layers has a characteristic \"Kylix\" shape, which gives the method its name. For optimum performance, the butterfly degrees also decrease down the layers. Furthermore, to efficiently route sparse updates to the nodes that need them, the network must be nested. While the approach is amenable to various kinds of sparse data, almost all \"Big Data\" sets show power-law statistics, and from the properties of these, we derive methods for optimal network design. Finally, we present experiments showing with Kylix on Amazon EC2 and demonstrating significant improvements over existing systems such as PowerGraph and Hadoop.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"36","resultStr":"{\"title\":\"Kylix: A Sparse Allreduce for Commodity Clusters\",\"authors\":\"Huasha Zhao, J. Canny\",\"doi\":\"10.1109/ICPP.2014.36\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Allreduce is a basic building block for parallel computing. Our target here is \\\"Big Data\\\" processing on commodity clusters (mostly sparse power-law data). Allreduce can be used to synchronize models, to maintain distributed datasets, and to perform operations on distributed data such as sparse matrix multiply. We first review a key constraint on cluster communication, the minimum efficient packet size, which hampers the use of direct all-to-all protocols on large networks. Our allreduce network is a nested, heterogeneous-degree butterfly. We show that communication volume in lower layers is typically much less than the top layer, and total communication across all layers a small constant larger than the top layer, which is close to optimal. A chart of network communication volume across layers has a characteristic \\\"Kylix\\\" shape, which gives the method its name. For optimum performance, the butterfly degrees also decrease down the layers. Furthermore, to efficiently route sparse updates to the nodes that need them, the network must be nested. While the approach is amenable to various kinds of sparse data, almost all \\\"Big Data\\\" sets show power-law statistics, and from the properties of these, we derive methods for optimal network design. Finally, we present experiments showing with Kylix on Amazon EC2 and demonstrating significant improvements over existing systems such as PowerGraph and Hadoop.\",\"PeriodicalId\":441115,\"journal\":{\"name\":\"2014 43rd International Conference on Parallel Processing\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-10-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"36\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 43rd International Conference on Parallel Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPP.2014.36\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 43rd International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPP.2014.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 36

摘要

Allreduce是并行计算的基本构建块。我们的目标是在商品集群(主要是稀疏幂律数据)上进行“大数据”处理。Allreduce可以用于同步模型，维护分布式数据集，并对分布式数据执行操作，如稀疏矩阵乘法。我们首先回顾了集群通信的一个关键约束，即最小有效数据包大小，它阻碍了在大型网络上直接使用全对全协议。我们的allreduce网络是一个嵌套的、异构度的蝴蝶。我们发现，底层的通信量通常远小于顶层，并且所有层之间的总通信量比顶层大一个小常数，这接近于最优。跨层网络通信量的图表具有典型的“Kylix”形状，因此该方法得名。为了获得最佳性能，蝶度也会随着层数的减小而减小。此外，为了有效地将稀疏更新路由到需要它们的节点，网络必须嵌套。虽然该方法适用于各种稀疏数据，但几乎所有的“大数据”集都显示幂律统计，并且从这些属性中，我们推导出最优网络设计的方法。最后，我们展示了Kylix在Amazon EC2上的实验，并展示了对现有系统(如PowerGraph和Hadoop)的重大改进。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Kylix: A Sparse Allreduce for Commodity Clusters

Allreduce is a basic building block for parallel computing. Our target here is "Big Data" processing on commodity clusters (mostly sparse power-law data). Allreduce can be used to synchronize models, to maintain distributed datasets, and to perform operations on distributed data such as sparse matrix multiply. We first review a key constraint on cluster communication, the minimum efficient packet size, which hampers the use of direct all-to-all protocols on large networks. Our allreduce network is a nested, heterogeneous-degree butterfly. We show that communication volume in lower layers is typically much less than the top layer, and total communication across all layers a small constant larger than the top layer, which is close to optimal. A chart of network communication volume across layers has a characteristic "Kylix" shape, which gives the method its name. For optimum performance, the butterfly degrees also decrease down the layers. Furthermore, to efficiently route sparse updates to the nodes that need them, the network must be nested. While the approach is amenable to various kinds of sparse data, almost all "Big Data" sets show power-law statistics, and from the properties of these, we derive methods for optimal network design. Finally, we present experiments showing with Kylix on Amazon EC2 and demonstrating significant improvements over existing systems such as PowerGraph and Hadoop.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2014 43rd International Conference on Parallel Processing

自引率

0.00%

发文量