Greater performance and better efficiency: Predicated execution has shown us the way

2016 International Conference on Parallel Architecture and Compilation Techniques (PACT)
Pub Date: 2016-09-11
DOI: 10.1145/2967938.2970376

Y. Patt



Abstract

We have been riding a strong wave of greater and greater performance for decades, to some extent due to the combination of Moore's Law and Dennard scaling. But we are told all this is coming to an end, in part because we cannot continue to double the transistor count on the chip and we cannot run these things at higher and higher frequencies. Much of the silliness promised by multicore is just that, and not the answer. So, what are we to do? It turns out predication gave us the answer more than 30 years ago. Most of us were not paying attention. Today we have no choice. Predication happened because the compiler, the ISA, and the microarchitecture all cooperated so it could happen. That meant breaking the artificial walls in the transformation hierarchy. If we accept this as something we have to do, there are plenty of opportunities (a) for increased performance (attacking latency instead of just multicore bandwidth) and (b) for better energy efficiency. In this talk I hope to point out some of them, and then ask the obvious question: What do we need to do to make this happen?
