Fully-Buffered DIMM Memory Architectures: Understanding Mechanisms, Overheads and Scaling

B. Ganesh, A. Jaleel, David T. Wang, B. Jacob
{"title":"Fully-Buffered DIMM Memory Architectures: Understanding Mechanisms, Overheads and Scaling","authors":"B. Ganesh, A. Jaleel, David T. Wang, B. Jacob","doi":"10.1109/HPCA.2007.346190","DOIUrl":null,"url":null,"abstract":"Performance gains in memory have traditionally been obtained by increasing memory bus widths and speeds. The diminishing returns of such techniques have led to the proposal of an alternate architecture, the fully-buffered DIMM. This new standard replaces the conventional memory bus with a narrow, high-speed interface between the memory controller and the DIMMs. This paper examines how traditional DDRx based memory controller policies for scheduling and row buffer management perform on a fully-buffered DIMM memory architecture. The split-bus architecture used by FBDIMM systems results in an average improvement of 7% in latency and 10% in bandwidth at higher utilizations. On the other hand, at lower utilizations, the increased cost of serialization resulted in a degradation in latency and bandwidth of 25% and 10% respectively. The split-bus architecture also makes the system performance sensitive to the ratio of read and write traffic in the workload. In larger configurations, we found that the FBDIMM system performance was more sensitive to usage of the FBDIMM links than to DRAM bank availability. In general, FBDIMM performance is similar to that of DDRx systems, and provides better performance characteristics at higher utilization, making it a relatively inexpensive mechanism for scaling capacity at higher bandwidth requirements. The mechanism is also largely insensitive to scheduling policies, provided certain ground rules are obeyed","PeriodicalId":177324,"journal":{"name":"2007 IEEE 13th International Symposium on High Performance Computer Architecture","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"88","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE 13th International Symposium on High Performance Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2007.346190","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 88

Abstract

Performance gains in memory have traditionally been obtained by increasing memory bus widths and speeds. The diminishing returns of such techniques have led to the proposal of an alternate architecture, the fully-buffered DIMM. This new standard replaces the conventional memory bus with a narrow, high-speed interface between the memory controller and the DIMMs. This paper examines how traditional DDRx-based memory controller policies for scheduling and row buffer management perform on a fully-buffered DIMM memory architecture. The split-bus architecture used by FBDIMM systems results in an average improvement of 7% in latency and 10% in bandwidth at higher utilizations. On the other hand, at lower utilizations, the increased cost of serialization results in a degradation in latency and bandwidth of 25% and 10%, respectively. The split-bus architecture also makes the system performance sensitive to the ratio of read and write traffic in the workload. In larger configurations, we found that the FBDIMM system performance was more sensitive to usage of the FBDIMM links than to DRAM bank availability. In general, FBDIMM performance is similar to that of DDRx systems, and provides better performance characteristics at higher utilization, making it a relatively inexpensive mechanism for scaling capacity at higher bandwidth requirements. The mechanism is also largely insensitive to scheduling policies, provided certain ground rules are obeyed.
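The mechanism behind these results is the replacement of the single shared DDRx data bus with two narrow, serialized, daisy-chained links: a southbound link carrying commands and write data and a northbound link carrying read data, with a buffer on each DIMM relaying frames along the chain. The sketch below is a minimal toy model of that tradeoff, not taken from the paper: the class names `Ddrx` and `Fbdimm` and every numeric parameter (frame times, per-hop buffer delay, DRAM core latency, frames per burst) are illustrative assumptions. It only shows qualitatively why serialization hurts unloaded read latency while the split read/write paths help peak bandwidth when the traffic mix contains both reads and writes.

```python
# Toy latency/bandwidth model contrasting a DDRx-style shared bus with an
# FBDIMM-style split bus (separate southbound command/write link and
# northbound read link, daisy-chained through per-DIMM buffers).
# All parameters below are illustrative assumptions, not figures from the paper.

from dataclasses import dataclass

@dataclass
class Ddrx:
    bus_ns_per_burst: float = 5.0   # time the shared data bus is busy per 64 B burst (assumed)
    dram_core_ns: float = 40.0      # activate + CAS latency inside the DRAM (assumed)

    def read_latency(self) -> float:
        return self.dram_core_ns + self.bus_ns_per_burst

    def peak_bandwidth_gbs(self, write_fraction: float) -> float:
        # Reads and writes share one bus, so the traffic mix does not change the peak.
        return 64 / self.bus_ns_per_burst  # bytes per ns == GB/s

@dataclass
class Fbdimm:
    frame_ns: float = 2.0           # serialization time of one frame on a link (assumed)
    hop_ns: float = 3.0             # pass-through delay per buffered DIMM hop (assumed)
    dimms_to_target: int = 2        # position of the addressed DIMM in the chain (assumed)
    dram_core_ns: float = 40.0
    frames_per_burst: int = 4       # northbound frames needed for a 64 B burst (assumed)

    def read_latency(self) -> float:
        # Command serialized southbound, relayed hop by hop to the target DIMM,
        # DRAM access, then data serialized northbound back through the same hops.
        down = self.frame_ns + self.hop_ns * self.dimms_to_target
        up = self.frames_per_burst * self.frame_ns + self.hop_ns * self.dimms_to_target
        return down + self.dram_core_ns + up

    def peak_bandwidth_gbs(self, write_fraction: float) -> float:
        # Reads and writes travel on separate links and overlap; the busier of
        # the two links limits throughput for a given read/write mix.
        read_link_ns = (1 - write_fraction) * self.frames_per_burst * self.frame_ns
        write_link_ns = write_fraction * self.frames_per_burst * self.frame_ns
        return 64 / max(read_link_ns, write_link_ns, 1e-9)

if __name__ == "__main__":
    ddrx, fb = Ddrx(), Fbdimm()
    print(f"unloaded read latency: DDRx {ddrx.read_latency():.1f} ns, "
          f"FBDIMM {fb.read_latency():.1f} ns")
    for wf in (0.0, 0.3, 0.5):
        print(f"write fraction {wf:.1f}: peak BW DDRx {ddrx.peak_bandwidth_gbs(wf):.1f} GB/s, "
              f"FBDIMM {fb.peak_bandwidth_gbs(wf):.1f} GB/s")
```

With the assumed defaults, the sketch reproduces only the qualitative shape of the paper's findings: FBDIMM's unloaded read latency is higher because the command and data are serialized over narrow links and relayed through the DIMM chain, while its sustained bandwidth overtakes the shared bus once the workload mixes reads and writes, which is also why performance is sensitive to the read/write ratio. None of the specific percentages in the abstract should be read into this model.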