Executing Optimized Irregular Applications Using Task Graphs within Existing Parallel Models

2012 SC Companion: High Performance Computing, Networking Storage and Analysis. Pub Date: 2012-11-10. DOI: 10.1109/SC.Companion.2012.43

Christopher D. Krieger, M. Strout, J. Roelofs, A. Bajwa

Pages: 261-268

Citations: 13

Abstract

Many sparse or irregular scientific computations are memory bound and benefit from locality-improving optimizations such as blocking or tiling. These optimizations result in asynchronous parallelism that can be represented by arbitrary task graphs. Unfortunately, most popular parallel programming models, with the exception of Threading Building Blocks (TBB), do not directly execute arbitrary task graphs. In this paper, we compare the programming and execution of arbitrary task graphs qualitatively and quantitatively in TBB, the OpenMP doall model, the OpenMP 3.0 task model, and Cilk Plus. We present performance and scalability results for 8-core and 40-core shared memory systems on a sparse matrix iterative solver and a molecular dynamics benchmark.
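To make the idea of executing an arbitrary task graph concrete, the following is a minimal sketch in Python. It is not the paper's implementation and does not use the TBB, OpenMP, or Cilk Plus APIs. The names `run_task_graph`, `tasks`, and `deps` are hypothetical, and the sketch assumes the graph is acyclic and that no task raises an exception. Each task tracks how many of its predecessors are still unfinished and is submitted to a thread pool once that count reaches zero.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def run_task_graph(tasks, deps, max_workers=4):
    """Run each task once, only after all of its predecessors have finished.

    tasks: dict mapping a task name to a zero-argument callable (hypothetical).
    deps:  dict mapping a task name to the list of its predecessor names.
    Assumes an acyclic graph and tasks that do not raise.
    Returns the task names in the order they completed.
    """
    # Count unfinished predecessors and build the successor lists.
    remaining = {name: len(deps.get(name, [])) for name in tasks}
    successors = {name: [] for name in tasks}
    for name, preds in deps.items():
        for pred in preds:
            successors[pred].append(name)

    lock = threading.Lock()
    all_done = threading.Event()
    finished = []

    def execute(name):
        tasks[name]()
        ready = []
        with lock:
            finished.append(name)
            # Release successors whose last predecessor just finished.
            for succ in successors[name]:
                remaining[succ] -= 1
                if remaining[succ] == 0:
                    ready.append(succ)
            if len(finished) == len(tasks):
                all_done.set()
        for succ in ready:
            pool.submit(execute, succ)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        if not tasks:
            all_done.set()
        # Tasks without predecessors can start immediately.
        for name, count in list(remaining.items()):
            if count == 0:
                pool.submit(execute, name)
        # Keep the pool open until every task has completed.
        all_done.wait()
    return finished


# Diamond-shaped graph: A precedes B and C, which both precede D.
example_tasks = {name: (lambda: None) for name in "ABCD"}
example_deps = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
example_order = run_task_graph(example_tasks, example_deps)
```

In the diamond example, B and C may complete in either order, but A always completes first and D last. This dependency-driven release is what distinguishes task graph execution from a barrier-separated doall schedule.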

