Efficient run-time scheduling for parallelizing partially parallel loops

Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing. Pub Date: 1997-12-10. DOI: 10.1109/ICAPP.1997.651508

Tsung-Chuan Huang, Po-Hsueh Hsu, Tze-Nan Sheng


Citations: 5

Abstract

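We propose an efficient run-time technique that finds an optimal parallel execution schedule for partially parallel loops, in which synchronization between iterations is needed to preserve correct program semantics. For efficiency, the conventional mark phase and scheduler phase are combined into a single parallel scheduler. The scheduler divides the loop iterations into several chunks and then executes the iterations in one chunk in parallel. The scheme not only runs fast but also achieves an optimal schedule. In addition, an atomic bit-vector operation is introduced to avoid global synchronization overhead and to ensure that the larger wavefront number is kept when the wavefront number of an iteration is updated concurrently during scheduling.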
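
The wavefront idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' scheduler: it is a plain sequential inspector, with none of the chunking, parallel scheduling, or atomic bit-vector operation the paper describes. The function names, the loop body `A[w[i]] = A[r[i]] + 1`, and the index arrays `w` and `r` are assumptions made only for illustration. Each iteration gets the smallest wavefront number that places it after every earlier iteration it depends on, and iterations sharing a number can run in parallel.

```python
from collections import defaultdict


def wavefront_numbers(reads, writes):
    """Give each iteration the earliest wavefront that respects flow,
    anti, and output dependences on the elements it accesses."""
    last_write = defaultdict(int)  # element -> wavefront of its latest writer
    last_read = defaultdict(int)   # element -> largest wavefront among its readers
    wf = []
    for r_set, w_set in zip(reads, writes):
        level = 1
        for x in r_set:  # flow dependence: read after an earlier write
            level = max(level, last_write[x] + 1)
        for x in w_set:  # output and anti dependences: write after write/read
            level = max(level, last_write[x] + 1, last_read[x] + 1)
        for x in r_set:
            last_read[x] = max(last_read[x], level)
        for x in w_set:
            last_write[x] = level
        wf.append(level)
    return wf


def group_by_wavefront(wf):
    """Collect iteration indices by wavefront, in increasing wavefront order."""
    groups = defaultdict(list)
    for i, level in enumerate(wf):
        groups[level].append(i)
    return [groups[level] for level in sorted(groups)]


# Hypothetical loop body: A[w[i]] = A[r[i]] + 1
w = [0, 1, 2, 1, 3]
r = [4, 0, 0, 2, 1]
wf = wavefront_numbers([{x} for x in r], [{x} for x in w])
schedule = group_by_wavefront(wf)
print(wf)        # [1, 2, 2, 3, 4]
print(schedule)  # [[0], [1, 2], [3], [4]]
```

In this example, iterations 1 and 2 both read `A[0]` after iteration 0 writes it and write different elements, so they share wavefront 2. A parallel inspector would have several threads updating the same wavefront entry at once, which is the situation where the abstract says the larger value must be kept.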



