{"title":"基于TBB偷工反思的优化","authors":"A. Robison, Michael J. Voss, Alexey Kukanov","doi":"10.1109/IPDPS.2008.4536188","DOIUrl":null,"url":null,"abstract":"Intelreg Threading Building Blocks (Intelreg TBB) is a C++ library for parallel programming. Its templates for generic parallel loops are built upon nested parallelism and a work-stealing scheduler. This paper discusses optimizations where the high-level algorithm inspects or biases stealing. Two optimizations are discussed in detail. The first dynamically optimizes grain size based on observed stealing. The second improves prior work that exploits cache locality by biased stealing. This paper shows that in a task stealing environment, deferring task spawning can improve performance in some contexts. Performance results for simple kernels are presented.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"os-28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"98","resultStr":"{\"title\":\"Optimization via Reflection on Work Stealing in TBB\",\"authors\":\"A. Robison, Michael J. Voss, Alexey Kukanov\",\"doi\":\"10.1109/IPDPS.2008.4536188\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Intelreg Threading Building Blocks (Intelreg TBB) is a C++ library for parallel programming. Its templates for generic parallel loops are built upon nested parallelism and a work-stealing scheduler. This paper discusses optimizations where the high-level algorithm inspects or biases stealing. Two optimizations are discussed in detail. The first dynamically optimizes grain size based on observed stealing. The second improves prior work that exploits cache locality by biased stealing. This paper shows that in a task stealing environment, deferring task spawning can improve performance in some contexts. Performance results for simple kernels are presented.\",\"PeriodicalId\":162608,\"journal\":{\"name\":\"2008 IEEE International Symposium on Parallel and Distributed Processing\",\"volume\":\"os-28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"98\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 IEEE International Symposium on Parallel and Distributed Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPS.2008.4536188\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Symposium on Parallel and Distributed Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2008.4536188","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 98
摘要
Intelreg Threading Building Blocks (Intelreg TBB)是一个并行编程的c++库。它的通用并行循环模板建立在嵌套并行性和偷取工作的调度器之上。本文讨论了高级算法检查或偏差窃取的优化。详细讨论了两种优化方法。第一种是基于观察到的偷窃动态优化粒度。第二种方法改进了先前的工作,即通过有偏差的窃取来利用缓存局部性。本文表明,在任务窃取环境下,延迟任务生成可以在某些情况下提高性能。给出了简单核的性能结果。
Optimization via Reflection on Work Stealing in TBB
Intelreg Threading Building Blocks (Intelreg TBB) is a C++ library for parallel programming. Its templates for generic parallel loops are built upon nested parallelism and a work-stealing scheduler. This paper discusses optimizations where the high-level algorithm inspects or biases stealing. Two optimizations are discussed in detail. The first dynamically optimizes grain size based on observed stealing. The second improves prior work that exploits cache locality by biased stealing. This paper shows that in a task stealing environment, deferring task spawning can improve performance in some contexts. Performance results for simple kernels are presented.