{"title":"Tag tables","authors":"Sean Franey, Mikko H. Lipasti","doi":"10.1109/HPCA.2015.7056059","DOIUrl":null,"url":null,"abstract":"Tag Tables enable storage of tags for very large set-associative caches - such as those afforded by 3D DRAM integration - with fine-grained block sizes (e.g. 64B) with low enough overhead to be feasibly implemented on the processor die in SRAM. This approach differs from previous proposals utilizing small block sizes which have assumed that on-chip tag arrays for DRAM caches are too expensive and have consequently stored them with the data in the DRAM itself. Tag Tables are able to avoid the costly overhead of traditional tag arrays by exploiting the natural spatial locality of applications to track the location of data in the cache via a compact \"base-plus-offset\" encoding. Further, Tag Tables leverage the on-demand nature of a forward page table structure to only allocate storage for those entries that correspond to data currently present in the cache, as opposed to the static cost imposed by a traditional tag array. Through high associativity, we show that Tag Tables provide an average performance improvement of more than 10% over the prior state-of-the-art - Alloy Cache - 44% more than the Loh-Hill Cache due to fast on-chip lookups, and 58% over a no-L4 system through a range of multithreaded and multiprogrammed workloads with high L3 miss rates.","PeriodicalId":6593,"journal":{"name":"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)","volume":"40 1","pages":"514-525"},"PeriodicalIF":0.0000,"publicationDate":"2015-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2015.7056059","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 32
Abstract
Tag Tables enable storage of the tags for very large set-associative caches, such as those afforded by 3D DRAM integration, at fine-grained block sizes (e.g., 64B) with low enough overhead to be feasibly implemented on the processor die in SRAM. This approach differs from previous small-block proposals, which have assumed that on-chip tag arrays for DRAM caches are too expensive and have consequently stored the tags alongside the data in the DRAM itself. Tag Tables avoid the costly overhead of a traditional tag array by exploiting the natural spatial locality of applications, tracking the location of data in the cache via a compact "base-plus-offset" encoding. Further, Tag Tables leverage the on-demand nature of a forward page table, allocating storage only for entries that correspond to data currently present in the cache, in contrast to the static cost imposed by a traditional tag array. We show that, thanks to their high associativity, Tag Tables deliver an average performance improvement of more than 10% over the prior state of the art (Alloy Cache), 44% over the Loh-Hill Cache owing to fast on-chip lookups, and 58% over a system with no L4, across a range of multithreaded and multiprogrammed workloads with high L3 miss rates.
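To make the two mechanisms named in the abstract concrete, the sketch below illustrates, under assumed parameters, how a forward, page-table-like structure can allocate tag storage on demand and how a "base-plus-offset" leaf entry can encode block locations compactly. All names, sizes, and field widths (REGION_BLOCKS, ROOT_ENTRIES, tt_lookup, tt_insert, and so on) are illustrative assumptions, not details taken from the paper.

```c
/*
 * A minimal sketch (not the authors' implementation) of:
 *   (1) a forward, page-table-like lookup structure whose leaves are
 *       allocated only when blocks from that region are actually cached, and
 *   (2) a "base-plus-offset" leaf encoding that stores one full DRAM-cache
 *       location per region and a small per-block offset relative to it.
 * A single-level radix indexed by low region bits is used here for brevity;
 * a real Tag Table is multi-level and must disambiguate regions that alias
 * to the same slot.
 */
#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>

#define BLOCK_SHIFT    6      /* 64 B blocks                              */
#define REGION_BLOCKS  64     /* blocks tracked per leaf entry (assumed)  */
#define ROOT_ENTRIES   4096   /* first-level radix of the table (assumed) */
#define OFFSET_INVALID 0xFF   /* "block not present" marker               */

/* Leaf: one base location plus a small offset per block in the region. */
typedef struct {
    uint32_t base_loc;               /* DRAM-cache location of the base   */
    uint8_t  offset[REGION_BLOCKS];  /* base_loc + offset, or invalid     */
} tt_leaf;

/* Root: allocated up front; leaves are allocated on demand, like a
 * forward page table, so idle regions cost nothing beyond a pointer. */
typedef struct {
    tt_leaf *leaf[ROOT_ENTRIES];
} tag_table;

/* Look up a physical block address; returns true and the DRAM-cache
 * location on a hit, false on a miss (no leaf or invalid offset). */
static bool tt_lookup(const tag_table *tt, uint64_t paddr, uint32_t *loc)
{
    uint64_t block  = paddr >> BLOCK_SHIFT;
    uint64_t region = block / REGION_BLOCKS;
    unsigned idx    = block % REGION_BLOCKS;

    const tt_leaf *leaf = tt->leaf[region % ROOT_ENTRIES];
    if (!leaf || leaf->offset[idx] == OFFSET_INVALID)
        return false;                                  /* L4 miss         */
    *loc = leaf->base_loc + leaf->offset[idx];         /* base + offset   */
    return true;
}

/* Record that a block now resides at cache location `loc`, allocating
 * the leaf lazily on first use of its region. */
static void tt_insert(tag_table *tt, uint64_t paddr, uint32_t loc)
{
    uint64_t block  = paddr >> BLOCK_SHIFT;
    uint64_t region = block / REGION_BLOCKS;
    unsigned idx    = block % REGION_BLOCKS;
    unsigned slot   = region % ROOT_ENTRIES;

    tt_leaf *leaf = tt->leaf[slot];
    if (!leaf) {
        leaf = calloc(1, sizeof *leaf);
        if (!leaf)
            return;                     /* allocation failure: drop entry */
        for (unsigned i = 0; i < REGION_BLOCKS; i++)
            leaf->offset[i] = OFFSET_INVALID;
        leaf->base_loc = loc;           /* first block sets the base      */
        tt->leaf[slot] = leaf;
    }
    /* Spatial locality means neighbouring blocks usually land near the
     * base; a real design must handle offsets that do not fit the field. */
    leaf->offset[idx] = (uint8_t)(loc - leaf->base_loc);
}
```

The point of the encoding is that spatial locality lets many neighbouring blocks share one full base location, so each block needs only a few offset bits, and regions with nothing cached cost nothing beyond a null pointer, in contrast to a conventional tag array whose storage is fixed regardless of occupancy.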