Memory latency-tolerance approaches for Itanium processors: out-of-order execution vs. speculative precomputation

P. Wang, Hong Wang, Jamison D. Collins, Edward T. Grochowski, R. Kling, John Paul Shen
Proceedings Eighth International Symposium on High Performance Computer Architecture, published 2002-02-02
DOI: 10.1109/HPCA.2002.995709 (https://doi.org/10.1109/HPCA.2002.995709)
Citations: 39

Abstract

The performance of in-order execution Itanium™ processors can suffer significantly due to cache misses. Two memory latency-tolerance approaches can be applied to Itanium processors. One uses an out-of-order (OOO) execution core; the other assumes multithreading support and exploits cache prefetching via speculative precomputation (SP). This paper evaluates and contrasts these two approaches. In addition, it assesses the effectiveness of combining them. For a select set of memory-intensive programs, an in-order SMT Itanium processor using speculative precomputation can achieve a performance improvement (92%) comparable to that of an out-of-order design (87%). Applying both OOO and SP yields a total performance improvement of 141% over the baseline in-order machine. OOO tends to be effective in prefetching for L1 misses, whereas SP is primarily good at covering L2 and L3 misses. Our analysis indicates that the two approaches can be redundant or complementary depending on the type of delinquent loads that each targets. Both approaches are effective on delinquent loads in the loop body; however, only SP is effective on delinquent loads found in loop control code.
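
The three percentages in the abstract can be put on a common scale by converting each into a speedup factor over the baseline in-order machine. The sketch below is arithmetic on the reported figures only. It assumes all three numbers refer to the same program set and the same baseline, and the names `speedup`, `sp_on_top_of_ooo`, and `ooo_on_top_of_sp` are illustrative and do not come from the paper.

```python
# Improvements over the baseline in-order machine, as reported in the abstract.
IMPROVEMENT_PCT = {"SP": 92.0, "OOO": 87.0, "OOO+SP": 141.0}


def speedup(improvement_pct: float) -> float:
    """Convert a percentage improvement into a speedup factor over the baseline."""
    return 1.0 + improvement_pct / 100.0


speedups = {name: speedup(pct) for name, pct in IMPROVEMENT_PCT.items()}

# Incremental gain from adding one technique to a machine that already has the
# other. Assumption: the three figures share one baseline and one program set.
sp_on_top_of_ooo = speedups["OOO+SP"] / speedups["OOO"]
ooo_on_top_of_sp = speedups["OOO+SP"] / speedups["SP"]

if __name__ == "__main__":
    for name, factor in speedups.items():
        print(f"{name}: {factor:.2f}x over baseline")
    print(f"SP added to OOO: {sp_on_top_of_ooo:.2f}x")
    print(f"OOO added to SP: {ooo_on_top_of_sp:.2f}x")
```

Under these assumptions, adding SP to an OOO machine gives roughly a further 29%, and adding OOO to an SP machine roughly a further 26%. Each is well below what the same technique achieves alone, yet still a real gain, which fits the abstract's remark that the two approaches can be redundant or complementary depending on the delinquent loads each targets.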