Buffer-on-board memory systems

E. Cooper-Balis, P. Rosenfeld, B. Jacob
Published in: 2012 39th Annual International Symposium on Computer Architecture (ISCA)
Publication date: 2012-06-09
DOI: 10.1145/2366231.2337204 (https://doi.org/10.1145/2366231.2337204)
Citations: 54

Abstract

The design and implementation of the commodity memory architecture have resulted in significant performance and capacity limitations. To circumvent these limitations, designers and vendors have begun to place intermediate logic between the CPU and DRAM. This additional logic has two functions: to control the DRAM and to communicate with the CPU over a fast and narrow bus. The benefit provided by this logic is a reduction in pin-out to the memory system and increased signal integrity to the DRAM, allowing faster clock rates while maintaining capacity. While the few vendors utilizing this design have used the same general approach, their implementations vary greatly in their non-trivial details. A hardware-verified simulation suite is developed to accurately model and evaluate the behavior of this buffer-on-board memory system. A study of this design space is used to determine optimal use of the resources involved. This includes DRAM and bus organization, queue storage, and mapping schemes. Various constraints based on implementation costs are placed on simulated configurations to confirm that these optimizations apply to viable systems. Finally, full system simulations are performed to better understand how this memory system interacts with an operating system executing an application, with the goal of uncovering behaviors not present in simple limit-case simulations. When applying insights gleaned from these simulations, optimal performance can be achieved while still considering outside constraints (i.e., pin-out, power, and fabrication costs).
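The abstract lists mapping schemes among the resources studied but gives no detail. As a rough illustration only, the sketch below shows what an address mapping scheme generally does in a multi-channel DRAM system: it slices a physical address into channel, rank, bank, row, and column fields, and the order of those fields decides how consecutive accesses spread across channels. The field widths, the 64-byte transfer size, the scheme names, and the function `map_address` are all assumptions made for this example; none of them come from the paper or its simulator.

```python
from collections import namedtuple

DramAddress = namedtuple("DramAddress", "channel rank bank row column")

# Hypothetical geometry (bits per field); not taken from the paper.
FIELD_BITS = {"channel": 2, "rank": 1, "bank": 3, "row": 14, "column": 7}


def map_address(addr, order, field_bits=FIELD_BITS, offset_bits=6):
    """Split a physical address into DRAM coordinates.

    `order` lists the fields from least- to most-significant position.
    `offset_bits` drops the byte offset inside one transfer (assumed 64 B).
    """
    if sorted(order) != sorted(field_bits):
        raise ValueError("order must name each field exactly once")
    addr >>= offset_bits
    fields = {}
    for name in order:
        bits = field_bits[name]
        fields[name] = addr & ((1 << bits) - 1)  # take the low `bits` bits
        addr >>= bits                            # then move to the next field
    return DramAddress(**fields)


# Two illustrative orderings (names are hypothetical).
# Channel bits lowest: consecutive transfers rotate across channels.
CHANNEL_INTERLEAVED = ["channel", "column", "bank", "rank", "row"]
# Channel bits highest: consecutive transfers stay in one channel and row.
ROW_LOCAL = ["column", "bank", "rank", "row", "channel"]

if __name__ == "__main__":
    for scheme_name, scheme in [("channel-interleaved", CHANNEL_INTERLEAVED),
                                ("row-local", ROW_LOCAL)]:
        print(scheme_name)
        for addr in range(0, 256, 64):
            print(f"  0x{addr:04x} -> {map_address(addr, scheme)}")
```

With the channel field in the lowest position, four consecutive 64-byte transfers land on four different channels; with it in the highest position, they stay on one channel and walk through adjacent columns of the same row. Which ordering performs better depends on the workload and the bus and queue configuration, which is the kind of trade-off the design-space study described above is meant to evaluate.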