Degree-Aware Kernels for Computing Jaccard Weights on GPUs

2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2022-05-01. DOI: 10.1109/ipdps53621.2022.00092

Amro Alabsi Aljundi, Taha Atahan Akyildiz, K. Kaya


Citations: 1

Abstract

Graphs make it possible to extract valuable metrics from the structural properties of the underlying data they represent. One such metric is the Jaccard weight of an edge: the ratio of the number of common neighbors of the edge's endpoints to the size of the union of the endpoints' neighborhoods. A naive implementation of Jaccard weight computation has a complexity that scales with the number of edges in the graph times the square of the maximum degree. Recently, GPU-based parallel algorithms have been proposed for this problem. However, these algorithms cannot overcome the structural variance within a graph, i.e., the sparsity pattern and degree imbalance, which translates directly into unbalanced work distribution across threads. In this work, we propose an optimized GPU-based algorithm with an ML-based work distribution model that mitigates this imbalance. Our algorithm is shown to be up to 35x and on average 12x faster than the state of the art in practice while using less memory. In fact, we show that by manually tweaking the load distribution, a state-of-the-art implementation can be made 5x faster. In addition, we propose a multi-core, shared-memory algorithm that applies a traditional but effective technique to improve the computation asymptotically and performs comparably to the GPU algorithms. Our code is available at https://github.com/SU-HPC/Jaccard-ML.
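To make the definition concrete, the following is a minimal sequential sketch in Python. It is not the authors' GPU kernel or their ML-based work distribution model. It assumes an undirected graph stored as sorted adjacency lists, uses open neighborhoods (a vertex is not its own neighbor), and the names `count_common` and `jaccard_weights` are hypothetical. The intersection is a merge over two sorted lists, which costs time proportional to the sum of the two endpoint degrees per edge. This also illustrates why work per edge varies so widely on graphs with skewed degree distributions.

```python
def count_common(a, b):
    """Count elements shared by two sorted lists with a linear merge."""
    i = j = common = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common


def jaccard_weights(adj):
    """Return {(u, v): weight} for every undirected edge with u < v.

    adj maps each vertex to a sorted list of its neighbors
    (assumption: no self-loops, no duplicate edges).
    """
    weights = {}
    for u, nbrs_u in adj.items():
        for v in nbrs_u:
            if u < v:  # visit each undirected edge once
                nbrs_v = adj[v]
                common = count_common(nbrs_u, nbrs_v)
                # |N(u) ∪ N(v)| = |N(u)| + |N(v)| - |N(u) ∩ N(v)|
                union = len(nbrs_u) + len(nbrs_v) - common
                weights[(u, v)] = common / union
    return weights
```

Because u is in N(v) and v is in N(u) for every edge, the union is never empty, so the division is safe under the stated assumptions.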
