MicRun: A framework for scale-free graph algorithms on SIMD architecture of the Xeon Phi

2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) | Pub Date: 2017-07-01 | DOI: 10.1109/ASAP.2017.7995269

Jie Lin, Q. Wu, Yusong Tan, Jie Yu, Qi Zhang, Xiaoling Li, Lei Luo


Citations: 5

Abstract

Graph algorithms play an increasingly important role, especially in social-network analysis and language modeling. Accelerating them on heterogeneous high-performance computers with many integrated cores and wide SIMD lanes has recently become mainstream. However, existing methods are limited by inefficient grouping strategies and by non-optimized selection of the graph tile size, and they fall short of expectations in many respects. Moreover, few convenient, integrated tools exist for deploying graph algorithms on the MIC (Many Integrated Core) architecture. In this paper, we propose MicRun, an efficient and flexible framework for running graph algorithms on the SIMD architecture of the Xeon Phi. MicRun has two key components: the Bucket Grouping module and the Auto-tuning module. The Bucket Grouping module uses an optimization algorithm to split graph tiles into conflict-free groups that can be processed directly with SIMD parallelism. The Auto-tuning module applies a novel strategy for optimizing the tile size to improve the execution efficiency of graph computation. MicRun currently supports the Bellman-Ford and PageRank algorithms, and we conduct extensive validation experiments on it. Experimental results show that MicRun outperforms existing mechanisms in terms of storage and time overhead. As a result, both graph algorithms achieve an average speedup of 1.1× with MicRun over the state of the art.
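To make the idea of "conflict-free groups" concrete, the sketch below packs the edges of one tile into groups in which no two edges write to the same destination vertex, so each group could in principle be updated by one SIMD scatter without write conflicts. This is only an illustration under stated assumptions: it uses a simple greedy first-fit rule, not MicRun's actual Bucket Grouping algorithm, which the abstract does not describe. The function name `group_conflict_free`, the edge representation, and the default of 16 lanes (the number of 32-bit elements in a 512-bit vector) are all hypothetical choices made for this example.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]  # (source vertex, destination vertex); hypothetical layout


def group_conflict_free(edges: List[Edge], lanes: int = 16) -> List[List[Edge]]:
    """Greedy first-fit grouping (illustrative, not MicRun's algorithm).

    Each group holds at most `lanes` edges, and no two edges in a group
    share a destination vertex, so their updates never collide.
    """
    groups: List[List[Edge]] = []
    dests: List[Set[int]] = []  # destination vertices already used per group

    for src, dst in edges:
        for i, group in enumerate(groups):
            # Reuse the first group that has a free lane and no clash on dst.
            if len(group) < lanes and dst not in dests[i]:
                group.append((src, dst))
                dests[i].add(dst)
                break
        else:
            # No existing group can take this edge: open a new one.
            groups.append([(src, dst)])
            dests.append({dst})
    return groups


# Small tile with three edges pointing at vertex 2, grouped for 2 lanes.
example = group_conflict_free([(0, 2), (1, 2), (3, 4), (5, 2), (6, 7)], lanes=2)
print(example)
```

With two lanes, the three edges that target vertex 2 end up in three different groups, while the remaining edges fill the free lanes. First-fit is easy to follow but can leave lanes idle on skewed, scale-free degree distributions, which is the kind of inefficiency a dedicated grouping strategy aims to reduce.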

