A Simple Activation/Deactivation Prefetching Scheme for Chip Multiprocessors

Vicent Selfa, Crispín Gómez Requena, M. E. Gómez, J. Sahuquillo
Published in: 2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)
Publication date: 2016-02-01
DOI: 10.1109/PDP.2016.47 (https://doi.org/10.1109/PDP.2016.47)
Citations: 2

Abstract

Prefetching significantly reduces memory latency for a wide range of applications and thus increases system performance. However, as a speculative technique, prefetching may also noticeably increase the number of memory accesses, which in turn may negatively affect main memory bandwidth consumption, performance, and power. Main memory bandwidth is a critical resource, especially in current multicore processors, since memory requests from all the cores, both prefetch and demand requests, compete with one another for access to the DRAM banks. Consequently, demand requests may be delayed, hurting system performance. This work proposes the Activation/Deactivation Policies (ADP) scheme for hardware prefetchers in multicore processors. The scheme relies on policies that turn on the prefetcher of a given core when prefetches are expected to improve performance, and turn it off when performance is expected to improve scarcely or not at all. The proposed mechanism effectively reduces the memory bandwidth requirements of some cores with respect to a typical always-prefetching mechanism, thereby making extra bandwidth available to the co-runners. Results on a four-core processor show that ADP prefetching performs within ±2.5% of always prefetching while significantly reducing the memory bandwidth consumed by useless prefetches. Moreover, in some applications this reduction is as much as 50%. ADP prefetching is applicable to stream-based prefetchers, global-history-buffer delta correlation prefetchers, and PC-based stride prefetchers.
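The abstract does not spell out the activation and deactivation policies themselves, so the following is only a minimal sketch of what a per-core on/off controller might look like. Everything specific in it is an assumption made for illustration: the class name `PrefetcherController`, the use of prefetch accuracy (useful prefetches divided by issued prefetches) as the decision metric, the interval-based evaluation, and both threshold values. It should not be read as the policies evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass
class PrefetcherController:
    """Hypothetical per-core on/off controller for a hardware prefetcher.

    The accuracy metric and both thresholds are illustrative assumptions,
    not the policies defined in the paper.
    """
    activate_threshold: float = 0.40    # assumed: turn on at or above this accuracy
    deactivate_threshold: float = 0.20  # assumed: turn off below this accuracy
    enabled: bool = True

    def end_of_interval(self, useful: int, issued: int) -> bool:
        """Update the on/off state from one interval's counters.

        While the prefetcher is off, the counters are assumed to describe
        prefetches that *would* have been issued (a training-only mode),
        so that accuracy can still be estimated.
        """
        if issued == 0:
            return self.enabled  # no evidence this interval; keep current state
        accuracy = useful / issued
        if self.enabled and accuracy < self.deactivate_threshold:
            self.enabled = False  # prefetches mostly waste bandwidth
        elif not self.enabled and accuracy >= self.activate_threshold:
            self.enabled = True   # prefetches look likely to help again
        return self.enabled


def simulate(intervals):
    """Run one controller over (useful, issued) pairs; return the state after each."""
    ctrl = PrefetcherController()
    return [ctrl.end_of_interval(useful, issued) for useful, issued in intervals]


if __name__ == "__main__":
    print(simulate([(80, 100), (10, 100), (30, 100), (50, 100), (0, 0)]))
```

The two thresholds are kept apart on purpose in this sketch: an accuracy between them leaves the current state unchanged, which avoids toggling the prefetcher every interval when accuracy hovers near a single cut-off. In a multicore setting, one such controller per core would let a core with inaccurate prefetches release bandwidth to its co-runners.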