M. Grossman, H. Pritchard, Zoran Budimlic, Vivek Sarkar
Proceedings of the Second Annual PGAS Applications Workshop, published 2017-11-12. DOI: 10.1145/3144779.3144781 (https://doi.org/10.1145/3144779.3144781)
Graph500 on OpenSHMEM: Using A Practical Survey of Past Work to Motivate Novel Algorithmic Developments
Graph500 is an open specification of a graph-based benchmark for high-performance computing (HPC). The core computational kernel of Graph500 is a breadth-first search (BFS) of an undirected graph. Unlike many other HPC benchmarks, Graph500 is characterized by heavily irregular and fine-grained computation, memory accesses, and network communication. It can therefore serve as a more realistic stress test of modern HPC hardware, software, and algorithmic techniques than other benchmarking efforts. OpenSHMEM, in turn, is an open specification of a communication model, based on the partitioned global address space (PGAS) and single program, multiple data (SPMD) paradigms, for communicating across large numbers of processing elements. OpenSHMEM explicitly focuses on applications characterized by fine-grained communication, of which Graph500 is one example. There is thus a natural synergy between the communication patterns of Graph500 and the capabilities of OpenSHMEM. In this work we explore that synergy by developing several novel implementations of Graph500 on various OpenSHMEM implementations. We contribute a review of the state of the art in distributed Graph500 implementations, as well as a performance and programmability comparison between the state of the art and our own OpenSHMEM-based implementations. Our results demonstrate improved scaling of Graph500's BFS kernel out to 1,024 nodes of the Edison supercomputer, achieving a ~2.5x performance improvement relative to the highest-performing reference implementation at that scale.
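To make the kernel described above concrete, the following is a minimal, single-process sketch of a level-synchronous breadth-first search over an undirected edge list. It is not the authors' implementation and uses no OpenSHMEM calls; the function names `build_adjacency` and `bfs_parents` and the choice of a parent map as output are assumptions made for illustration only.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)  # undirected: store both directions
    return adj


def bfs_parents(edges, root):
    """Level-synchronous BFS returning a {vertex: parent} map.

    The root is recorded as its own parent. Vertices that are not
    reachable from the root do not appear in the result.
    """
    adj = build_adjacency(edges)
    parent = {root: root}
    frontier = [root]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                # Data-dependent lookup: which entry is touched depends
                # entirely on the graph, which is the source of the
                # irregular access pattern.
                if v not in parent:
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent


if __name__ == "__main__":
    sample_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5)]
    print(bfs_parents(sample_edges, 0))
```

In a distributed setting, where vertices are partitioned across processing elements, each `parent` lookup or update for a vertex owned elsewhere would become a small remote operation. This is one way to see why the kernel produces the fine-grained communication the abstract refers to.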