Irregular fine-grain parallel computing based on the slide register window architecture of Hitachi SR2201

Proceedings. International Conference on Parallel Computing in Electrical Engineering. Pub Date: 2002-09-22. DOI: 10.1109/PCEE.2002.1115194

A. Smyk, M. Tudruj


Citations: 0


Irregular fine-grain parallel computing based on the slide register window architecture of Hitachi SR2201

This article presents an optimization method for the parallelized execution of irregular fine-grain computations. The method was implemented using the pseudo-vector processing (PVP) and sliding window register (SWR) mechanisms provided in the Hitachi SR2201 supercomputer. The general idea of PVP and SWR relies on optimizing access to large contiguous regions of memory and on executing in parallel the three kinds of operations placed in loops: loading data, storing data, and arithmetic operations. The main disadvantage of these mechanisms is that a gain can be obtained only for long loops with regular expressions inside them. Our method focuses instead on irregular computations, devoid of any predictable dependencies. We divided a given code into parts and manually optimized the relations between load and store operations, taking into account the memory latency and the delays in accessing the needed data. In our implementation, we obtained a speedup by a simple reordering of the sequence of access operations to registers and memory.
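The reordering idea in the abstract can be illustrated with a minimal sketch in plain C. It is not the authors' code and it does not use the SR2201 PVP or SWR mechanisms; the function names, the gather-and-multiply kernel, and the index array `idx` are all hypothetical, chosen only to show loads being issued ahead of the arithmetic that consumes them. A compiler or an out-of-order processor may already reorder these operations, so any benefit on a given machine is an assumption to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: every iteration loads its operands and uses them at once,
   so the multiply-add waits for the irregular load a[idx[i]]. */
double gather_dot_naive(const double *a, const double *b,
                        const size_t *idx, size_t n)
{
    assert(n == 0 || (a && b && idx));
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += a[idx[i]] * b[i];
    }
    return sum;
}

/* Reordered: the loads for iteration i+1 are issued before the
   arithmetic of iteration i, so memory latency can overlap with
   computation. The products are accumulated in the same order as in
   the baseline, so both functions return the same value. */
double gather_dot_reordered(const double *a, const double *b,
                            const size_t *idx, size_t n)
{
    assert(n == 0 || (a && b && idx));
    if (n == 0) {
        return 0.0;
    }
    double sum = 0.0;
    double cur_a = a[idx[0]];   /* prologue: first loads */
    double cur_b = b[0];
    for (size_t i = 0; i + 1 < n; i++) {
        double next_a = a[idx[i + 1]];  /* load ahead of use */
        double next_b = b[i + 1];
        sum += cur_a * cur_b;           /* arithmetic on earlier loads */
        cur_a = next_a;
        cur_b = next_b;
    }
    sum += cur_a * cur_b;       /* epilogue: last pending product */
    return sum;
}
```

The loop is split into a prologue, a steady state, and an epilogue, which is the usual shape of a hand-pipelined loop. Only the order of memory accesses relative to the arithmetic changes, which corresponds to the kind of simple reordering the abstract describes.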
