GTS:一种基于gpu流拓扑的快速可扩展图形处理方法

Proceedings of the 2016 International Conference on Management of Data Pub Date : 2016-06-14 DOI:10.1145/2882903.2915204

Min-Soo Kim, K. An, Himchan Park, Hyunseok Seo, Jinwook Kim

{"title":"GTS:一种基于gpu流拓扑的快速可扩展图形处理方法","authors":"Min-Soo Kim, K. An, Himchan Park, Hyunseok Seo, Jinwook Kim","doi":"10.1145/2882903.2915204","DOIUrl":null,"url":null,"abstract":"A fast and scalable graph processing method becomes increasingly important as graphs become popular in a wide range of applications and their sizes are growing rapidly. Most of distributed graph processing methods require a lot of machines equipped with a total of thousands of CPU cores and a few terabyte main memory for handling billion-scale graphs. Meanwhile, GPUs could be a promising direction toward fast processing of large-scale graphs by exploiting thousands of GPU cores. All of the existing methods using GPUs, however, fail to process large-scale graphs that do not fit in main memory of a single machine. Here, we propose a fast and scalable graph processing method GTS that handles even RMAT32 (64 billion edges) very efficiently only by using a single machine. The proposed method stores graphs in PCI-E SSDs and executes a graph algorithm using thousands of GPU cores while streaming topology data of graphs to GPUs via PCI-E interface. GTS is fast due to no communication overhead and scalable due to no data duplication from graph partitioning among machines. Through extensive experiments, we show that GTS consistently and significantly outperforms the major distributed graph processing methods, GraphX, Giraph, and PowerGraph, and the state-of-the-art GPU-based method TOTEM.","PeriodicalId":20483,"journal":{"name":"Proceedings of the 2016 International Conference on Management of Data","volume":"137 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"69","resultStr":"{\"title\":\"GTS: A Fast and Scalable Graph Processing Method based on Streaming Topology to GPUs\",\"authors\":\"Min-Soo Kim, K. An, Himchan Park, Hyunseok Seo, Jinwook Kim\",\"doi\":\"10.1145/2882903.2915204\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A fast and scalable graph processing method becomes increasingly important as graphs become popular in a wide range of applications and their sizes are growing rapidly. Most of distributed graph processing methods require a lot of machines equipped with a total of thousands of CPU cores and a few terabyte main memory for handling billion-scale graphs. Meanwhile, GPUs could be a promising direction toward fast processing of large-scale graphs by exploiting thousands of GPU cores. All of the existing methods using GPUs, however, fail to process large-scale graphs that do not fit in main memory of a single machine. Here, we propose a fast and scalable graph processing method GTS that handles even RMAT32 (64 billion edges) very efficiently only by using a single machine. The proposed method stores graphs in PCI-E SSDs and executes a graph algorithm using thousands of GPU cores while streaming topology data of graphs to GPUs via PCI-E interface. GTS is fast due to no communication overhead and scalable due to no data duplication from graph partitioning among machines. Through extensive experiments, we show that GTS consistently and significantly outperforms the major distributed graph processing methods, GraphX, Giraph, and PowerGraph, and the state-of-the-art GPU-based method TOTEM.\",\"PeriodicalId\":20483,\"journal\":{\"name\":\"Proceedings of the 2016 International Conference on Management of Data\",\"volume\":\"137 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"69\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2016 International Conference on Management of Data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2882903.2915204\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2016 International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2882903.2915204","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 69

摘要

随着图形在广泛的应用中越来越流行，图形的大小也在迅速增长，一种快速、可扩展的图形处理方法变得越来越重要。大多数分布式图形处理方法需要大量的机器，总共配备数千个CPU内核和几tb的主内存来处理十亿规模的图形。同时，GPU可以利用数千个GPU内核来快速处理大规模图形，这是一个很有前途的方向。然而，所有使用gpu的现有方法都无法处理不适合单个机器主存储器的大规模图形。在这里，我们提出了一种快速且可扩展的图形处理方法GTS，该方法仅使用一台机器就可以非常有效地处理RMAT32(640亿个边)。该方法将图形存储在PCI-E固态硬盘中，并使用数千个GPU内核执行图形算法，同时通过PCI-E接口将图形拓扑数据流式传输到GPU。GTS由于没有通信开销而速度很快，并且由于机器之间的图分区没有数据复制而具有可伸缩性。通过广泛的实验，我们表明GTS持续且显著地优于主要的分布式图形处理方法，GraphX, Giraph和PowerGraph，以及最先进的基于gpu的方法TOTEM。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

GTS: A Fast and Scalable Graph Processing Method based on Streaming Topology to GPUs

A fast and scalable graph processing method becomes increasingly important as graphs become popular in a wide range of applications and their sizes are growing rapidly. Most of distributed graph processing methods require a lot of machines equipped with a total of thousands of CPU cores and a few terabyte main memory for handling billion-scale graphs. Meanwhile, GPUs could be a promising direction toward fast processing of large-scale graphs by exploiting thousands of GPU cores. All of the existing methods using GPUs, however, fail to process large-scale graphs that do not fit in main memory of a single machine. Here, we propose a fast and scalable graph processing method GTS that handles even RMAT32 (64 billion edges) very efficiently only by using a single machine. The proposed method stores graphs in PCI-E SSDs and executes a graph algorithm using thousands of GPU cores while streaming topology data of graphs to GPUs via PCI-E interface. GTS is fast due to no communication overhead and scalable due to no data duplication from graph partitioning among machines. Through extensive experiments, we show that GTS consistently and significantly outperforms the major distributed graph processing methods, GraphX, Giraph, and PowerGraph, and the state-of-the-art GPU-based method TOTEM.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2016 International Conference on Management of Data

自引率

0.00%

发文量