Optimizing Data Accesses for Breadth-First Search on Shared Memory Computers

2015 14th International Symposium on Parallel and Distributed Computing. Pub Date: 2015-06-29. DOI: 10.1109/ISPDC.2015.25

Z. Hu, Huashan Yu


Citations: 0

Abstract

Breadth-first search (BFS) is a widely used graph algorithm. It is data-intensive, and its data accesses are random and discontinuous. Data-access latency accounts for a large share of the algorithm's running time on shared memory computers, because it can hardly be reduced by processor techniques such as dynamic instruction execution and data prefetching. This work focuses on partitioning computation for BFS on shared memory computers. The goal is to improve data-access efficiency and to optimize load balance among processors. A data-centric parallel computing model is presented. The model gives each processor a partitioned, hierarchical data view, and automatically assigns the computation on each data partition to a set of processors that share the same data view. This computation partitioning mechanism allows applications to minimize data-access collisions among processors. A BFS equipped with the data-centric computation partitioning mechanism has been implemented. Two strategies are introduced to improve its performance further. One improves vertex-access efficiency by representing the status of vertices with a bitmap. The other improves load balance by adjusting every processor's workload dynamically. The model and the strategies have been evaluated with both real and synthetic graphs. Compared with the BFS without the data-centric computation partitioning mechanism, the new BFS achieves a 1.8-2.6× speedup. We believe this mechanism is also applicable to other graph applications.
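The abstract names two ideas that a small example can make concrete: tracking vertex status in a bitmap, and grouping work by the data partition that owns each vertex. The sketch below is only an illustration of those two ideas and is not the authors' implementation. The `Bitmap` class, the `bfs_levels` function, the `num_parts` parameter, and the contiguous-range partitioning are all assumptions introduced here, and the partitions are processed sequentially rather than by separate processors.

```python
from typing import List


class Bitmap:
    """One bit per vertex (hypothetical helper, not taken from the paper)."""

    def __init__(self, size: int) -> None:
        self.bits = bytearray((size + 7) // 8)

    def test(self, i: int) -> bool:
        return (self.bits[i >> 3] >> (i & 7)) & 1 == 1

    def set(self, i: int) -> None:
        self.bits[i >> 3] |= 1 << (i & 7)


def bfs_levels(adj: List[List[int]], source: int, num_parts: int = 2) -> List[int]:
    """Level-synchronous BFS; returns the BFS depth of each vertex (-1 if unreachable)."""
    n = len(adj)
    # Assumption: each partition owns a contiguous range of vertex ids.
    part_size = (n + num_parts - 1) // num_parts
    visited = Bitmap(n)
    level = [-1] * n
    visited.set(source)
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        # Route every candidate vertex to the partition that owns it, so that
        # each partition's slice of the bitmap is touched by one worker only.
        buckets: List[List[int]] = [[] for _ in range(num_parts)]
        for u in frontier:
            for v in adj[u]:
                buckets[v // part_size].append(v)
        next_frontier: List[int] = []
        # In a parallel version, each bucket could be handled by the
        # processors assigned to that partition; here they run in turn.
        for bucket in buckets:
            for v in bucket:
                if not visited.test(v):
                    visited.set(v)
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level
```

Packing the status into one bit per vertex shrinks the visited set by a factor of eight compared with a byte array, so more of it can stay in cache during the random accesses that BFS generates. The dynamic workload adjustment mentioned in the abstract is not modeled in this sketch.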
