Greater performance and better efficiency: Predicated execution has shown us the way

Y. Patt
Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT)
Publication date: 2016-09-11
DOI: 10.1145/2967938.2970376 (https://doi.org/10.1145/2967938.2970376)

Abstract

We have been riding a strong wave of greater and greater performance for decades, to some extent due to the combination of Moore's Law and Dennard scaling. But we are told all this is coming to an end, in part because we cannot continue to double the transistor count on the chip and we cannot run these things at higher and higher frequencies. Much of the silliness promised by multicore is just that, and not the answer. So, what are we to do? It turns out predication gave us the answer more than 30 years ago. Most of us were not paying attention. Today we have no choice. Predication happened because the compiler, the ISA, and the microarchitecture all cooperated so it could happen. That meant breaking the artificial walls in the transformation hierarchy. If we accept this as something we have to do, there are plenty of opportunities (a) for increased performance (attacking latency instead of just multicore bandwidth) and (b) for better energy efficiency. In this talk I hope to point out some of them, and then ask the obvious question: What do we need to do to make this happen?