{"title":"fpga上的区域高效近联想存储器","authors":"Udit Dhawan, A. DeHon","doi":"10.1145/2435264.2435298","DOIUrl":null,"url":null,"abstract":"Associative memories can map sparsely used keys to values with low latency but can incur heavy area overheads. The lack of customized hardware for associative memories in today's mainstream FPGAs exacerbates the overhead cost of building these memories using the fixed address match BRAMs. In this paper, we develop a new, FPGA-friendly, memory architecture based on a multiple hash scheme that is able to achieve near-associative performance (less than 5% of evictions due to conflicts) without the area overheads of a fully associative memory on FPGAs. Using the proposed architecture as a 64KB L1 data cache, we show that it is able to achieve near-associative miss-rates while consuming 6-7× less FPGA memory resources for a set of benchmark programs from the SPEC2006 suite than fully associative memories generated by the Xilinx Coregen tool. Benefits increase with match width, allowing area reduction up to 100×. At the same time, the new architecture has lower latency than the fully associative memory -- 3.7 ns for a 1024-entry flat version or 6.1 ns for an area-efficient version compared to 8.8 ns for a fully associative memory for a 64b key.","PeriodicalId":87257,"journal":{"name":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","volume":"80 1","pages":"191-200"},"PeriodicalIF":0.0000,"publicationDate":"2013-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Area-efficient near-associative memories on FPGAs\",\"authors\":\"Udit Dhawan, A. DeHon\",\"doi\":\"10.1145/2435264.2435298\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Associative memories can map sparsely used keys to values with low latency but can incur heavy area overheads. The lack of customized hardware for associative memories in today's mainstream FPGAs exacerbates the overhead cost of building these memories using the fixed address match BRAMs. In this paper, we develop a new, FPGA-friendly, memory architecture based on a multiple hash scheme that is able to achieve near-associative performance (less than 5% of evictions due to conflicts) without the area overheads of a fully associative memory on FPGAs. Using the proposed architecture as a 64KB L1 data cache, we show that it is able to achieve near-associative miss-rates while consuming 6-7× less FPGA memory resources for a set of benchmark programs from the SPEC2006 suite than fully associative memories generated by the Xilinx Coregen tool. Benefits increase with match width, allowing area reduction up to 100×. At the same time, the new architecture has lower latency than the fully associative memory -- 3.7 ns for a 1024-entry flat version or 6.1 ns for an area-efficient version compared to 8.8 ns for a fully associative memory for a 64b key.\",\"PeriodicalId\":87257,\"journal\":{\"name\":\"FPGA. ACM International Symposium on Field-Programmable Gate Arrays\",\"volume\":\"80 1\",\"pages\":\"191-200\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-02-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"FPGA. 
ACM International Symposium on Field-Programmable Gate Arrays\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2435264.2435298\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2435264.2435298","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Associative memories can map sparsely used keys to values with low latency, but they can incur heavy area overheads. The lack of customized hardware for associative memories in today's mainstream FPGAs exacerbates the cost of building these memories out of the fixed-address-match BRAMs. In this paper, we develop a new, FPGA-friendly memory architecture based on a multiple-hash scheme that achieves near-associative performance (less than 5% of evictions due to conflicts) without the area overheads of a fully associative memory on FPGAs. Using the proposed architecture as a 64KB L1 data cache, we show that, for a set of benchmark programs from the SPEC2006 suite, it achieves near-associative miss rates while consuming 6-7× fewer FPGA memory resources than fully associative memories generated by the Xilinx Coregen tool. The benefits increase with match width, allowing area reductions of up to 100×. At the same time, the new architecture has lower latency than the fully associative memory: 3.7 ns for a 1024-entry flat version and 6.1 ns for an area-efficient version, compared to 8.8 ns for a fully associative memory with a 64-bit key.
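The core mechanism the abstract describes is a multiple-hash lookup: each key indexes one candidate slot in each of several independently hashed memory banks, a value may reside in any of its candidate slots, and a conflict eviction occurs only when every candidate is already occupied. The sketch below illustrates that general idea in software; it is not the paper's exact FPGA design, and the bank count, hash functions, and eviction policy are illustrative assumptions.

```python
import random

class MultiHashMemory:
    """Minimal software sketch of a d-way multiple-hash memory.

    Each key hashes to one candidate slot in each of `d` banks; a lookup
    probes all d candidates, and an insert uses any empty candidate,
    evicting a resident entry only when all candidates are occupied.
    """

    def __init__(self, banks=4, bank_size=256, seed=0):
        self.d = banks
        self.bank_size = bank_size
        rng = random.Random(seed)
        # One random odd multiplier per bank stands in for d independent
        # hash functions (a hardware design would use dedicated hashes).
        self.mults = [rng.randrange(1, 1 << 31) | 1 for _ in range(banks)]
        # Each bank slot holds a (key, value) pair or None.
        self.banks = [[None] * bank_size for _ in range(banks)]
        self.evictions = 0

    def _slot(self, bank, key):
        return (key * self.mults[bank]) % self.bank_size

    def lookup(self, key):
        for b in range(self.d):
            entry = self.banks[b][self._slot(b, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None  # miss

    def insert(self, key, value):
        slots = [(b, self._slot(b, key)) for b in range(self.d)]
        # Overwrite an existing entry for this key, if present.
        for b, s in slots:
            if self.banks[b][s] is not None and self.banks[b][s][0] == key:
                self.banks[b][s] = (key, value)
                return
        # Otherwise take any empty candidate slot.
        for b, s in slots:
            if self.banks[b][s] is None:
                self.banks[b][s] = (key, value)
                return
        # All d candidates occupied: conflict eviction.
        self.evictions += 1
        b, s = slots[0]
        self.banks[b][s] = (key, value)

if __name__ == "__main__":
    mem = MultiHashMemory(banks=4, bank_size=256)
    keys = random.Random(1).sample(range(1 << 20), 512)  # sparsely used keys
    for k in keys:
        mem.insert(k, k ^ 0xFFFF)
    hits = sum(mem.lookup(k) == (k ^ 0xFFFF) for k in keys)
    print(f"hits {hits}/{len(keys)}, conflict evictions {mem.evictions}")
```

Probing all d candidate slots in parallel is exactly what independent BRAM banks make cheap on an FPGA; the sequential loop above only emulates the placement and eviction policy, not the single-cycle parallel probe of the hardware architecture.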