A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs

2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Pub Date: 2018-09-01. DOI: 10.1109/CAHPC.2018.8645946

H. Anzt, J. Dongarra


Citations: 2



A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs

Jaccard weights are a popular metric for identifying communities in social network analytics. In this paper we present a kernel for efficiently computing the Jaccard weight matrix on GPUs. The kernel design is guided by fine-grained parallelism and the independent thread scheduling supported by NVIDIA's Volta architecture. This technology makes it possible to interleave the execution of divergent branches for enhanced data reuse and a higher instruction per cycle rate for memory-bound algorithms. In a performance evaluation using a set of publicly available social networks, we report the kernel execution time and analyze the built-in hardware counters on different GPU architectures. The findings have implications beyond the specific algorithm and suggest a reformulation of other data-sparse algorithms.
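As a point of reference for what such a kernel computes, the following is a minimal sequential sketch in Python. It is not the authors' GPU kernel. It assumes the standard definition of the Jaccard weight of an edge (u, v) as |N(u) ∩ N(v)| / |N(u) ∪ N(v)|, where N(x) is the neighbor set of x, and it assumes the graph is stored as a CSR adjacency structure with sorted, duplicate-free rows. The names `row_ptr`, `col_idx`, `sorted_intersection_size`, and `jaccard_weights`, as well as the 4-vertex example graph, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def sorted_intersection_size(a: List[int], b: List[int]) -> int:
    """Count the common entries of two ascending lists with a two-pointer merge."""
    i = j = count = 0
    while i < len(a) and j < len(b):
        # Data-dependent three-way branch: which pointer advances depends
        # on the values being compared.
        if a[i] == b[j]:
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count


def jaccard_weights(row_ptr: List[int], col_idx: List[int]) -> Dict[Tuple[int, int], float]:
    """Jaccard weight for every stored edge (u, v) of a CSR adjacency matrix.

    Assumes the column indices of each row are sorted ascending and unique.
    """
    weights: Dict[Tuple[int, int], float] = {}
    n = len(row_ptr) - 1
    for u in range(n):
        nu = col_idx[row_ptr[u]:row_ptr[u + 1]]
        for v in nu:
            nv = col_idx[row_ptr[v]:row_ptr[v + 1]]
            common = sorted_intersection_size(nu, nv)
            union = len(nu) + len(nv) - common
            weights[(u, v)] = common / union if union else 0.0
    return weights


# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 2-3,
# stored symmetrically (both directions of each edge).
row_ptr = [0, 2, 4, 7, 8]
col_idx = [1, 2, 0, 2, 0, 1, 3, 2]
weights = jaccard_weights(row_ptr, col_idx)
```

The work per edge depends on the two neighbor-list lengths, and the merge loop takes a different branch depending on the data. If each edge were assigned to one GPU thread, this is the kind of loop in which threads of a warp could diverge; how the paper's kernel actually maps work to threads is not stated in the abstract.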

