{"title":"Technical Perspective:: From Think Parallel to Think Sequential","authors":"Z. Ives","doi":"10.1145/3277006.3277010","DOIUrl":null,"url":null,"abstract":"In recent years, the database and distributed systems communities have built a wide variety of runtime systems and programming models for largescale computing over graphs. Such “big graph processing systems” [1, 2, 4, 5, 7] o support highly scalable parallel execution of graph algorithms — e.g., computing shortest paths, graph centrality, connected components, or perhaps even graph clusters. As described in the excellent survey by Yan et al [6], most big graph processing systems require the programmer to adopt a vertex-centric or block-centric programming model. For the former, code only “sees” the state at one vertex, receives messages from other vertices, and can send messages to other vertices. Under the latter, code manages a set of vertices within a subgraph (“block”) and can communicate with the code managing other blocks. In “From think Parallel to Think Sequential,” Fan and colleagues argue that vertexand blockcentric programming models are not natural for programmers trained to think sequentially. Instead, they argue that a more intuitive programming model can be developed out of several very simple primitives that can be composed to do incremental computation (as has also been studied in more general “big data” systems [4, 3]). The authors propose four elegant building blocks: (1) a partial evaluation function, (2) an incremental update handling function, (3) mechanisms for updating and sharing parameters in global fashion, and (4) an aggregate function for when multiple workers are updating the same parameter. They build the GRAPE GRAPh Engine system, which implements this programming model, and they show that it provides excellent performance for a variety of graph algorithms. The paper presents a compelling case that, at least for certain classes of algorithms, the simple primitives may be both more natural and more amenable to optimization than standard vertex-centric approaches.","PeriodicalId":21740,"journal":{"name":"SIGMOD Rec.","volume":"1 1","pages":"14"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SIGMOD Rec.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3277006.3277010","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
In recent years, the database and distributed systems communities have built a wide variety of runtime systems and programming models for large-scale computing over graphs. Such "big graph processing systems" [1, 2, 4, 5, 7] support highly scalable parallel execution of graph algorithms, e.g., computing shortest paths, graph centrality, connected components, or perhaps even graph clusters. As described in the excellent survey by Yan et al. [6], most big graph processing systems require the programmer to adopt a vertex-centric or block-centric programming model. Under the former, code only "sees" the state at one vertex, receives messages from other vertices, and can send messages to other vertices (the vertex-centric style is sketched below). Under the latter, code manages a set of vertices within a subgraph ("block") and can communicate with the code managing other blocks.

In "From Think Parallel to Think Sequential," Fan and colleagues argue that vertex- and block-centric programming models are not natural for programmers trained to think sequentially. Instead, they argue that a more intuitive programming model can be developed out of several very simple primitives that can be composed to do incremental computation (as has also been studied in more general "big data" systems [3, 4]). The authors propose four elegant building blocks: (1) a partial evaluation function, (2) an incremental update handling function, (3) mechanisms for updating and sharing parameters in a global fashion, and (4) an aggregate function for when multiple workers are updating the same parameter (also sketched below). They build the GRAPE (GRAPh Engine) system, which implements this programming model, and they show that it provides excellent performance for a variety of graph algorithms. The paper presents a compelling case that, at least for certain classes of algorithms, these simple primitives may be both more natural and more amenable to optimization than standard vertex-centric approaches.
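To make the vertex-centric model concrete, here is a minimal sketch, assuming hypothetical names (vertex_centric_sssp, inbox, outbox) rather than any real engine's API: each vertex sees only its own distance, reacts to incoming messages, and sends messages to its neighbors, with the runtime repeating supersteps until no messages remain. A real system would distribute the vertices and message delivery across many workers; this single-process version only illustrates the programming style, using single-source shortest paths as the running example.

```python
import math
from collections import defaultdict

def vertex_centric_sssp(edges, source):
    """Single-source shortest paths in a Pregel-like, vertex-centric style.
    edges maps each vertex to a list of (neighbor, weight) pairs."""
    dist = defaultdict(lambda: math.inf)   # the only state a vertex "sees" is its own
    inbox = {source: [0.0]}                # messages delivered at the current superstep

    while inbox:                           # supersteps run until no messages remain
        outbox = defaultdict(list)
        for v, msgs in inbox.items():      # each vertex with mail runs its compute step
            best = min(msgs)
            if best < dist[v]:             # improvement: update local state and
                dist[v] = best             # notify the neighbors by message
                for u, w in edges.get(v, []):
                    outbox[u].append(best + w)
        inbox = dict(outbox)               # barrier: deliver messages, next superstep
    return dict(dist)

# Example: shortest distances from "a" in a tiny weighted graph.
graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 1.0)], "c": []}
print(vertex_centric_sssp(graph, "a"))     # {'a': 0.0, 'b': 1.0, 'c': 2.0}
```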
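The four building blocks can be illustrated in the same terms. The sketch below, again with hypothetical names (local_sssp, grape_style_sssp) and not GRAPE's actual interface, partitions the graph into fragments, runs an ordinary sequential shortest-paths routine on each fragment (partial evaluation), shares improved distances of border vertices through globally visible parameters, re-runs each affected fragment seeded only by the changed border values (incremental update), reconciles concurrent updates with an aggregate function (here, min), and finally assembles the per-fragment results.

```python
import math
from collections import defaultdict

def local_sssp(frag, seeds):
    """The "think sequential" part: plain label-correcting relaxation inside one
    fragment. frag maps every local vertex to a (possibly empty) list of
    (neighbor, weight) pairs; seeds maps some local vertices to start distances."""
    dist = defaultdict(lambda: math.inf, seeds)
    frontier = list(seeds)
    while frontier:
        v = frontier.pop()
        for u, w in frag.get(v, []):
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
                frontier.append(u)
    return dist

def grape_style_sssp(fragments, border, source):
    aggregate = min                              # (4) reconcile concurrent updates
    shared = defaultdict(lambda: math.inf)       # (3) globally shared update parameters
    shared[source] = 0.0
    partial = [defaultdict(lambda: math.inf) for _ in fragments]
    updated = {source}                           # parameters changed since the last round

    while updated:
        changed_seeds = {v: shared[v] for v in updated}
        updated = set()
        for i, frag in enumerate(fragments):
            seeds = {v: d for v, d in changed_seeds.items() if v in frag}
            if not seeds:
                continue
            # (1) partial evaluation on the first visit, (2) incremental
            # re-evaluation afterwards: only changed border values seed the run.
            result = local_sssp(frag, seeds)
            for v, d in result.items():
                if d < partial[i][v]:
                    partial[i][v] = d
                    if v in border and d < shared[v]:
                        shared[v] = aggregate(shared[v], d)
                        updated.add(v)

    # Assemble: take the best distance found for each vertex across fragments.
    final = defaultdict(lambda: math.inf)
    for p in partial:
        for v, d in p.items():
            final[v] = min(final[v], d)
    return dict(final)

# Example: the graph above split into two fragments sharing border vertex "b".
frag1 = {"a": [("b", 1.0)], "b": []}
frag2 = {"b": [("c", 1.0)], "c": []}
print(grape_style_sssp([frag1, frag2], border={"b"}, source="a"))
# {'a': 0.0, 'b': 1.0, 'c': 2.0}
```

The point of the sketch is that all graph traversal happens inside local_sssp, an ordinary sequential routine; the parallel machinery is confined to exchanging and aggregating border values between rounds.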