{"title":"Optimization via Reflection on Work Stealing in TBB","authors":"A. Robison, Michael J. Voss, Alexey Kukanov","doi":"10.1109/IPDPS.2008.4536188","DOIUrl":null,"url":null,"abstract":"Intelreg Threading Building Blocks (Intelreg TBB) is a C++ library for parallel programming. Its templates for generic parallel loops are built upon nested parallelism and a work-stealing scheduler. This paper discusses optimizations where the high-level algorithm inspects or biases stealing. Two optimizations are discussed in detail. The first dynamically optimizes grain size based on observed stealing. The second improves prior work that exploits cache locality by biased stealing. This paper shows that in a task stealing environment, deferring task spawning can improve performance in some contexts. Performance results for simple kernels are presented.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"os-28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"98","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Symposium on Parallel and Distributed Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2008.4536188","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 98
Abstract
Intelreg Threading Building Blocks (Intelreg TBB) is a C++ library for parallel programming. Its templates for generic parallel loops are built upon nested parallelism and a work-stealing scheduler. This paper discusses optimizations where the high-level algorithm inspects or biases stealing. Two optimizations are discussed in detail. The first dynamically optimizes grain size based on observed stealing. The second improves prior work that exploits cache locality by biased stealing. This paper shows that in a task stealing environment, deferring task spawning can improve performance in some contexts. Performance results for simple kernels are presented.