Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors

Mark Horowitz, Margaret Martonosi, Todd C. Mowry, Michael D. Smith
{"title":"Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors","authors":"M. Horowitz, M. Martonosi, T. Mowry, Michael D. Smith","doi":"10.1145/232973.233000","DOIUrl":null,"url":null,"abstract":"Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem successfully in specific situations. However, the generality of these software approaches has been limited because current architectures do not provide a fine-grained, low-overhead mechanism for observing and reacting to memory behavior directly. To fill this need, we propose a new class of memory operations called informing memory operations, which essentially consist of a memory operation combined (either implicitly or explicitly) with a conditional branch-and-link operation that is taken only if the reference suffers a cache miss. We describe two different implementations of informing memory operations---one based on a cache-outcome condition code and another based on low-overhead traps---and find that modern in-order-issue and out-of-order-issue superscalar processors already contain the bulk of the necessary hardware support. We describe how a number of software-based memory optimizations can exploit informing memory operations to enhance performance, and look at cache coherence with fine-grained access control as a case study. Our performance results demonstrate that the runtime overhead of invoking the informing mechanism on the Alpha 21164 and MIPS R10000 processors is generally small enough to provide considerable flexibility to hardware and software designers, and that the cache coherence application has improved performance compared to other current solutions. We believe that the inclusion of informing memory operations in future processors may spur even more innovative performance optimizations.","PeriodicalId":415354,"journal":{"name":"23rd Annual International Symposium on Computer Architecture (ISCA'96)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1996-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"100","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"23rd Annual International Symposium on Computer Architecture (ISCA'96)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/232973.233000","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 100

Abstract

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem successfully in specific situations. However, the generality of these software approaches has been limited because current architectures do not provide a fine-grained, low-overhead mechanism for observing and reacting to memory behavior directly. To fill this need, we propose a new class of memory operations called informing memory operations, which essentially consist of a memory operation combined (either implicitly or explicitly) with a conditional branch-and-link operation that is taken only if the reference suffers a cache miss. We describe two different implementations of informing memory operations---one based on a cache-outcome condition code and another based on low-overhead traps---and find that modern in-order-issue and out-of-order-issue superscalar processors already contain the bulk of the necessary hardware support. We describe how a number of software-based memory optimizations can exploit informing memory operations to enhance performance, and look at cache coherence with fine-grained access control as a case study. Our performance results demonstrate that the runtime overhead of invoking the informing mechanism on the Alpha 21164 and MIPS R10000 processors is generally small enough to provide considerable flexibility to hardware and software designers, and that the cache coherence application has improved performance compared to other current solutions. We believe that the inclusion of informing memory operations in future processors may spur even more innovative performance optimizations.
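
To make the mechanism concrete, here is a minimal sketch in plain C of the semantics the abstract describes: perform an ordinary load and, only if that reference misses in the cache, branch-and-link to a software miss handler. The function names (informing_load, miss_handler) and the toy direct-mapped cache model are illustrative assumptions for this sketch, not the paper's ISA encoding or hardware design.

    /* Sketch of informing-load semantics: load, and branch-and-link to a
     * software handler only if the reference misses.  The cache model and
     * all names here are hypothetical, used only to model the behavior. */
    #include <stdint.h>
    #include <stdio.h>

    #define LINE_BYTES 64
    #define NUM_SETS   256

    static uintptr_t tags[NUM_SETS];     /* toy direct-mapped cache tags */
    static uint64_t miss_count;          /* feedback consumed by software */

    static void miss_handler(const void *addr)   /* branch-and-link target */
    {
        /* Software reacts to the miss: here we just count it, but the
         * handler could instead prefetch, profile the PC, or perform
         * fine-grained access-control checks for cache coherence. */
        (void)addr;
        miss_count++;
    }

    static int informing_load(const int *p)      /* one informing memory op */
    {
        uintptr_t line = (uintptr_t)p / LINE_BYTES;
        unsigned set = (unsigned)(line % NUM_SETS);
        int hit = (tags[set] == line);
        tags[set] = line;                /* install the line's tag on a miss */
        int value = *p;                  /* the ordinary load itself */
        if (!hit)
            miss_handler(p);             /* taken only on a cache miss */
        return value;
    }

    int main(void)
    {
        int data[1024] = {0};
        long sum = 0;
        for (int i = 0; i < 1024; i++)
            sum += informing_load(&data[i]);
        printf("sum=%ld, simulated misses=%llu\n",
               sum, (unsigned long long)miss_count);
        return 0;
    }

In the paper's two hardware variants, the miss test in this sketch would be performed by the processor itself rather than in software: either a cache-outcome condition code tested by a conditional branch-and-link, or a low-overhead trap raised on the missing reference.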