Shaizeen Aga, Supreet Jeloka, Arun K. Subramaniyan, S. Narayanasamy, D. Blaauw, R. Das
{"title":"Compute Caches","authors":"Shaizeen Aga, Supreet Jeloka, Arun K. Subramaniyan, S. Narayanasamy, D. Blaauw, R. Das","doi":"10.1109/HPCA.2017.21","DOIUrl":null,"url":null,"abstract":"This paper presents the Compute Cache architecturethat enables in-place computation in caches. ComputeCaches uses emerging bit-line SRAM circuit technology to repurpose existing cache elements and transforms them into active very large vector computational units. Also, it significantlyreduces the overheads in moving data between different levelsin the cache hierarchy. Solutions to satisfy new constraints imposed by ComputeCaches such as operand locality are discussed. Also discussedare simple solutions to problems in integrating them into aconventional cache hierarchy while preserving properties suchas coherence, consistency, and reliability. Compute Caches increase performance by 1.9× and reduceenergy by 2.4× for a suite of data-centric applications, includingtext and database query processing, cryptographic kernels, and in-memory checkpointing. Applications with larger fractionof Compute Cache operations could benefit even more, asour micro-benchmarks indicate (54× throughput, 9× dynamicenergy savings).","PeriodicalId":118950,"journal":{"name":"2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"241","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2017.21","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 241
Abstract
This paper presents the Compute Cache architecturethat enables in-place computation in caches. ComputeCaches uses emerging bit-line SRAM circuit technology to repurpose existing cache elements and transforms them into active very large vector computational units. Also, it significantlyreduces the overheads in moving data between different levelsin the cache hierarchy. Solutions to satisfy new constraints imposed by ComputeCaches such as operand locality are discussed. Also discussedare simple solutions to problems in integrating them into aconventional cache hierarchy while preserving properties suchas coherence, consistency, and reliability. Compute Caches increase performance by 1.9× and reduceenergy by 2.4× for a suite of data-centric applications, includingtext and database query processing, cryptographic kernels, and in-memory checkpointing. Applications with larger fractionof Compute Cache operations could benefit even more, asour micro-benchmarks indicate (54× throughput, 9× dynamicenergy savings).