Using an oracle to measure potential parallelism in single instruction stream programs

MICRO 14. Pub Date: 1981-12-01. DOI: 10.1145/1014192.802448

A. Nicolau, J. A. Fisher


Citations: 21

Abstract

Horizontally microprogrammable CPUs belong to a class of machines having statically schedulable parallel instruction execution (SPIE machines). Several experiments have shown that within basic blocks, real code only gives a potential speed-up factor of 2 or 3 when compacted for SPIE machines, even in the presence of unlimited hardware. In this paper, similar experiments are described. However, these measure the potential parallelism available using any global compaction method, that is, one which compacts code beyond block boundaries. Global compaction is a subject of current investigation; no measurements yet exist on implemented systems.

The approach taken is to first assume that an oracle is available during compaction. This oracle can resolve all dynamic considerations in advance, giving us the ability to find the maximum parallelism available without reformulation of the algorithm. The parallelism found is constrained only by legitimate data dependencies, since questions of conditional jump directions and unresolved indirect memory references are answered by the oracle. Using such an oracle, we find that typical scientific programs may be sped up by anywhere from 3 to 1000 times. These dramatic results provide an upper bound for global compaction techniques. We describe experiments in progress which attempt to limit the oracle progressively, with the aim of eventually producing one which provides only information that may be obtained by a very good compiler. This will give us a more practical measure of the parallelism potentially obtainable via global compaction methods.
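The oracle idea in the abstract can be illustrated with a small sketch. It is not the authors' tool. It assumes that a program run has already been recorded as a trace of executed operations, so every branch direction and memory address is known, and it then schedules each operation as early as its data dependencies allow. The trace format, the function name `oracle_speedup`, the unit latency per operation, and the choice to honor only read-after-write dependencies (as if storage could be renamed freely) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# One executed operation: (locations written, locations read).
# This trace format is hypothetical; the abstract does not specify one.
Op = Tuple[List[str], List[str]]


def oracle_speedup(trace: Iterable[Op]) -> float:
    """Ratio of sequential length to dataflow-limited length of a trace.

    Assumptions (not from the paper): every operation takes one cycle,
    hardware is unlimited, and only read-after-write dependencies
    constrain the schedule.
    """
    ready: Dict[str, int] = {}  # location -> cycle its latest value is available
    n_ops = 0
    depth = 0  # length of the longest dependency chain seen so far
    for dests, srcs in trace:
        # An operation may start once all of its inputs have been produced.
        start = max((ready.get(s, 0) for s in srcs), default=0)
        finish = start + 1
        for d in dests:
            ready[d] = finish
        n_ops += 1
        depth = max(depth, finish)
    return n_ops / depth if depth else 1.0


# Eight independent loop iterations, each a two-operation chain.
loop_trace: List[Op] = []
for i in range(8):
    loop_trace.append(([f"t{i}"], [f"x{i}"]))   # t_i = f(x_i)
    loop_trace.append(([f"y{i}"], [f"t{i}"]))   # y_i = g(t_i)

# A fully serial chain: each operation reads the previous result.
chain_trace: List[Op] = [(["a"], []), (["b"], ["a"]), (["c"], ["b"])]

print(oracle_speedup(loop_trace))
print(oracle_speedup(chain_trace))
```

Under these assumptions the independent iterations collapse to a schedule two cycles deep, while the serial chain gains nothing. A less powerful oracle, of the kind the abstract describes as work in progress, would correspond to adding further constraints to the scheduling step, for example forcing operations to wait for branches whose direction a compiler could not predict.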
