High-Bandwidth Address Translation for Multiple-Issue Processors

T. Austin, G. Sohi
{"title":"High-Bandwidth Address Translation for Multiple-Issue Processors","authors":"T. Austin, G. Sohi","doi":"10.1145/232973.232990","DOIUrl":null,"url":null,"abstract":"In an effort to push the envelope of system performance, microprocessor designs are continually exploiting higher levels of instruction-level parallelism, resulting in increasing bandwidth demands on the address translation mechanism. Most current microprocessor designs meet this demand with a multi-ported TLB. While this design provides an excellent hit rate at each port, its access latency and area grow very quickly as the number of ports is increased. As bandwidth demands continue to increase, multi-ported designs will soon impact memory access latency.We present four high-bandwidth address translation mechanisms with latency and area characteristics that scale better than a multiported TLB design. We extend traditional high-bandwidth memory design techniques to address translation, developing interleaved and multi-level TLB designs. In addition, we introduce two new designs crafted specifically for high-bandwidth address translation. Piggyback ports are introduced as a technique to exploit spatial locality in simultaneous translation requests, allowing accesses to the same virtual memory page to combine their requests at the TLB access port. Pretranslation is introduced as a technique for attaching translations to base register values, making it possible to reuse a single translation many times.We perform extensive simulation-based studies to evaluate our designs. We vary key system parameters, such as processor model, page size, and number of architected registers, to see what effects these changes have on the relative merits of each approach. A number of designs show particular promise. Multi-level TLBs with as few as eight entries in the upper-level TLB nearly achieve the performance of a TLB with unlimited bandwidth. Piggyback ports combined with a lesser-ported TLB structure, e.g., an interleaved or multi-ported TLB, also perform well. Pretranslation over a single-ported TLB performs almost as well as a same-sized multi-level TLB with the added benefit of decreased access latency for physically indexed caches.","PeriodicalId":415354,"journal":{"name":"23rd Annual International Symposium on Computer Architecture (ISCA'96)","volume":"267 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1996-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"23rd Annual International Symposium on Computer Architecture (ISCA'96)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/232973.232990","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 48

Abstract

In an effort to push the envelope of system performance, microprocessor designs are continually exploiting higher levels of instruction-level parallelism, resulting in increasing bandwidth demands on the address translation mechanism. Most current microprocessor designs meet this demand with a multi-ported TLB. While this design provides an excellent hit rate at each port, its access latency and area grow very quickly as the number of ports is increased. As bandwidth demands continue to increase, multi-ported designs will soon impact memory access latency.We present four high-bandwidth address translation mechanisms with latency and area characteristics that scale better than a multiported TLB design. We extend traditional high-bandwidth memory design techniques to address translation, developing interleaved and multi-level TLB designs. In addition, we introduce two new designs crafted specifically for high-bandwidth address translation. Piggyback ports are introduced as a technique to exploit spatial locality in simultaneous translation requests, allowing accesses to the same virtual memory page to combine their requests at the TLB access port. Pretranslation is introduced as a technique for attaching translations to base register values, making it possible to reuse a single translation many times.We perform extensive simulation-based studies to evaluate our designs. We vary key system parameters, such as processor model, page size, and number of architected registers, to see what effects these changes have on the relative merits of each approach. A number of designs show particular promise. Multi-level TLBs with as few as eight entries in the upper-level TLB nearly achieve the performance of a TLB with unlimited bandwidth. Piggyback ports combined with a lesser-ported TLB structure, e.g., an interleaved or multi-ported TLB, also perform well. Pretranslation over a single-ported TLB performs almost as well as a same-sized multi-level TLB with the added benefit of decreased access latency for physically indexed caches.
多问题处理器的高带宽地址转换
为了推动系统性能的极限,微处理器设计不断开发更高级别的指令级并行性,导致对地址转换机制的带宽需求增加。目前大多数微处理器设计都采用多端口TLB来满足这种需求。虽然这种设计在每个端口上提供了极好的命中率,但随着端口数量的增加,其访问延迟和面积增长得非常快。随着带宽需求的不断增加,多端口设计将很快影响内存访问延迟。我们提出了四种具有延迟和区域特性的高带宽地址转换机制,它们比多端口TLB设计具有更好的可扩展性。我们扩展了传统的高带宽存储器设计技术来解决翻译问题,开发了交错和多级TLB设计。此外,我们还介绍了两种专门用于高带宽地址转换的新设计。在同时翻译请求中引入了一种利用空间局部性的技术,允许访问相同的虚拟内存页以在TLB访问端口上组合它们的请求。预翻译是作为一种将翻译附加到基寄存器值的技术引入的,这使得多次重用单个翻译成为可能。我们进行广泛的基于模拟的研究来评估我们的设计。我们改变了关键的系统参数,例如处理器模型、页面大小和体系结构寄存器的数量,以查看这些更改对每种方法的相对优点有什么影响。许多设计显示出特别的希望。上层TLB中只有8个表项的多级TLB几乎可以达到无限带宽的TLB的性能。与较小端口TLB结构(例如,交错或多端口TLB)相结合的承载端口也表现良好。在单端口TLB上进行预翻译的性能几乎与相同大小的多级TLB一样好,并且还可以减少物理索引缓存的访问延迟。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信