Accelerated Work Stealing
D. B. Larkins, John Snyder, James Dinan
Proceedings of the 48th International Conference on Parallel Processing, 2019-08-05
DOI: 10.1145/3337821.3337878 (https://doi.org/10.1145/3337821.3337878)
Citation count: 2
Abstract
Realizing scalable performance with irregular parallel applications is challenging on large-scale distributed-memory clusters. These applications typically require continuous, dynamic load balancing to maintain efficiency. Work stealing is a common approach to dynamic distributed load balancing. However, its use in conjunction with advanced network offload capabilities is not well understood. We present a distributed work-stealing system that is amenable to acceleration using the Portals 4 network programming interface. Our work shows that the structures Portals provides for two-sided communication are general-purpose and can also accelerate work stealing. We demonstrate the effectiveness of this approach using established benchmarks from computational chemistry and unbalanced tree search. Results show that Portals-accelerated work stealing can greatly reduce communication overhead, task acquisition time, and termination detection time.
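For readers unfamiliar with work stealing, the general idea can be illustrated with a minimal single-process simulation. This is only a sketch and is not the Portals-based system the abstract describes: the function name `run_work_stealing`, the round-based scheduling, the random victim choice, and the toy task model (a task `n` spawns children `0..n-1`, giving an unbalanced tree of `2**n` tasks) are all assumptions made for illustration. The global `any(queues)` check stands in for the distributed termination detection that a real system would need.

```python
from collections import deque
import random


def run_work_stealing(root, num_workers=4, seed=0):
    """Simulate work stealing over a toy unbalanced task tree.

    Hypothetical task model: executing task n spawns children 0..n-1,
    so the tree rooted at n contains exactly 2**n tasks.
    Returns (tasks executed per worker, number of successful steals).
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root)          # all work starts on worker 0
    executed = [0] * num_workers
    steals = 0

    # A shared-memory shortcut: a distributed system cannot inspect
    # every queue at once and needs a termination detection protocol.
    while any(queues):
        for w in range(num_workers):
            q = queues[w]
            if q:
                # Owner works on the newest task (tail of its deque).
                task = q.pop()
                executed[w] += 1
                q.extend(range(task))   # spawn children 0..task-1
            elif num_workers > 1:
                # Idle worker picks a random victim and takes the
                # oldest task (head of the victim's deque).
                victim = rng.choice(
                    [v for v in range(num_workers) if v != w]
                )
                if queues[victim]:
                    q.append(queues[victim].popleft())
                    steals += 1
    return executed, steals


if __name__ == "__main__":
    per_worker, steals = run_work_stealing(10, num_workers=4)
    print("tasks per worker:", per_worker)
    print("total tasks:", sum(per_worker), "steals:", steals)
```

Whatever the worker count, the total number of executed tasks equals `2**root`, since stealing only moves tasks between queues. In a distributed setting each steal is a network operation, which is why the abstract singles out communication overhead, task acquisition time, and termination detection as the costs to reduce.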