GPU-based butterfly counting

Yifei Xia, Feng Zhang, Qingyu Xu, Mingde Zhang, Zhiming Yao, Lv Lu, Xiaoyong Du, Dong Deng, Bingsheng He, Siqi Ma
The VLDB Journal, published 2024-06-27. DOI: 10.1007/s00778-024-00861-0 (https://doi.org/10.1007/s00778-024-00861-0)

Abstract

Butterfly counting is a crucial and time-consuming operation on large bipartite graphs. Graphics processing units (GPUs) are widely used parallel heterogeneous devices that can significantly boost the performance of data science programs. However, no existing work enables efficient butterfly counting on GPUs. To fill this gap, we propose a GPU-based butterfly counting method called G-BFC. G-BFC solves three significant technical problems. First, butterfly counting involves massive serial operations, which lead to severe synchronization overheads and performance degradation. We unlock the serial region and use the GPU's shared memory to handle it efficiently. Second, butterfly counting on GPUs suffers from workload imbalance. To maximize efficiency, we develop a novel adaptive strategy that balances the workload among threads. Third, bipartite graphs contain a large number of two-hop paths, also known as wedges, which makes them costly to traverse during parallel butterfly counting. We develop an innovative preprocessing strategy that significantly cuts down the number of wedges required. We conduct comprehensive experiments on both server-grade and edge-grade GPU platforms, and the results show that G-BFC brings significant performance benefits: it achieves a 4.84\(\times\) speedup over the state-of-the-art solution on eleven real-world datasets.
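
To make the terms in the abstract concrete, the following is a minimal sequential sketch of wedge-based butterfly counting. It is not G-BFC and not the authors' code: it is a textbook-style baseline, assuming the graph is given as a list of `(u, v)` edges with `u` on one side and `v` on the other. The function name `count_butterflies` and the edge-list input format are illustrative assumptions. A butterfly is a 2x2 biclique (two vertices on each side, all four edges present), and a wedge is a two-hop path `u1 - v - u2`; if `c` wedges share the same endpoints `(u1, u2)`, they form `c * (c - 1) / 2` butterflies.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Count butterflies (2x2 bicliques) in a bipartite graph.

    `edges` is an iterable of (u, v) pairs, with u from one side of the
    graph and v from the other. Duplicate edges are ignored.
    """
    # For every right-side vertex v, collect its left-side neighbours.
    left_neighbours = defaultdict(set)
    for u, v in edges:
        left_neighbours[v].add(u)

    # Enumerate wedges u1 - v - u2 and tally them per endpoint pair.
    wedge_count = defaultdict(int)
    for us in left_neighbours.values():
        for u1, u2 in combinations(sorted(us), 2):
            wedge_count[(u1, u2)] += 1

    # c wedges with the same endpoints combine into c choose 2 butterflies.
    return sum(c * (c - 1) // 2 for c in wedge_count.values())


if __name__ == "__main__":
    # A single butterfly: left vertices 0 and 1 both connect to "a" and "b".
    square = [(0, "a"), (0, "b"), (1, "a"), (1, "b")]
    print(count_butterflies(square))
```

The wedge tally in the middle loop is where the cost concentrates: a vertex with `d` neighbours generates `d * (d - 1) / 2` wedges, which gives some intuition for why the abstract singles out the number of wedges and the workload imbalance across threads as problems for a parallel implementation.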
