Parallel graph algorithms by blocks: from I/O to algorithms

Abdurrahman Yasar, Kasimir Gabert, Ümit V. Çatalyürek
Proceedings of the 18th ACM International Conference on Computing Frontiers, published 2021-05-11
DOI: 10.1145/3457388.3459987 (https://doi.org/10.1145/3457388.3459987)

Abstract

In today's data-driven world and heterogeneous computing environments, processing large-scale graphs in an architecture-agnostic manner has become more crucial than ever. Graph analytics has developed along two lines. On the HPC side, there has been significant interest in developing hand-optimized high-performance computing solutions. On the systems side, following the big data movement and the goal of bringing parallel computing to the masses, researchers have proposed several graph processing and management systems to handle large-scale graphs. Hand-optimized HPC approaches require deep expertise and are expensive to develop and maintain, while graph processing frameworks suffer from limited expressibility and performance. We propose Parallel Graph Algorithms by Blocks (PGAbB), a block-based graph algorithms framework for shared-memory, multi-core, multi-GPU machines. PGAbB offers a sweet spot between efficient parallelism and architecture-agnostic algorithm design for a wide class of graph problems, while performing close to hand-optimized HPC implementations.

Although PGAbB and many other recent HPC graph-analytics frameworks are highly tuned and can run complex graph analytics in fractions of a second on billion-edge graphs, a gap remains in their end-to-end use. Despite the significant improvements that modern hardware and operating systems have made in input and output, reading the graph from the file system easily takes thousands of times longer than running the computational kernel itself. This slowdown causes both a disconnect for end users and a loss of productivity for researchers and developers. We close this gap by providing PIGO, a simple-to-use, small, header-only, and dependency-free C++11 library that brings I/O improvements to graph and sparse matrix systems. Using PIGO, we significantly improve the end-to-end performance of state-of-the-art systems, in many cases by more than 40×.
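The abstract does not spell out what "block-based" means in PGAbB. One common reading is a 2D partitioning of the adjacency matrix, in which vertices are split into contiguous ranges and each edge is placed in the block given by the ranges of its source and destination. The C++11 sketch below illustrates that idea only. The names `Edge`, `BlockedGraph`, `part_of`, and `make_blocks` are hypothetical and are not taken from PGAbB or PIGO, and the equal-width ranges are an assumption made for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for illustration; none of these names come from PGAbB.
using Edge = std::pair<std::uint32_t, std::uint32_t>;  // (source, destination)

struct BlockedGraph {
    std::uint32_t num_vertices;
    std::uint32_t parts;                    // vertex ranges per dimension
    std::vector<std::vector<Edge>> blocks;  // parts * parts buckets, row-major
};

// Map a vertex id to its range, assuming equal-width contiguous ranges.
inline std::uint32_t part_of(std::uint32_t v, std::uint32_t n,
                             std::uint32_t parts) {
    std::uint32_t width = (n + parts - 1) / parts;  // ceil(n / parts)
    return v / width;
}

// Bucket every edge into the block indexed by (source range, destination range).
BlockedGraph make_blocks(const std::vector<Edge>& edges, std::uint32_t n,
                         std::uint32_t parts) {
    assert(n > 0 && parts > 0);
    BlockedGraph g{n, parts,
                   std::vector<std::vector<Edge>>(std::size_t(parts) * parts)};
    for (const Edge& e : edges) {
        assert(e.first < n && e.second < n);
        std::size_t row = part_of(e.first, n, parts);
        std::size_t col = part_of(e.second, n, parts);
        g.blocks[row * parts + col].push_back(e);
    }
    return g;
}
```

Under this layout each block touches only two vertex ranges, so blocks can be handed to different cores or devices as independent units of work. How PGAbB actually chooses, balances, and schedules its blocks is not stated in the abstract.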