Li Zhao, R. Iyer, S. Makineni, D. Newell, Liqun Cheng
{"title":"NCID:一种非包容性缓存,包容性目录架构,用于灵活高效的缓存层次结构","authors":"Li Zhao, R. Iyer, S. Makineni, D. Newell, Liqun Cheng","doi":"10.1145/1787275.1787314","DOIUrl":null,"url":null,"abstract":"Chip-multiprocessor (CMP) architectures employ multi-level cache hierarchies with private L2 caches per core and a shared L3 cache like Intel's Nehalem processor and AMD's Barcelona processor. When designing a multi-level cache hierarchy, one of the key design choices is the inclusion policy: inclusive, non-inclusive or exclusive. Either choice has its benefits and drawbacks. An inclusive cache hierarchy (like Nehalem's L3) has the benefit of allowing incoming snoops to be filtered at the L3 cache, but suffers from (a) reduced space efficiency due to replication between the L2 and L3 caches and (b) reduced flexibility since it cannot bypass the L3 cache for transient or low priority data. In an inclusive L2/L3 cache hierarchy, it also becomes difficult to flexibly chop L3 cache size (or increase L2 cache size) for different product instantiations because the inclusion can start to affect performance (due to significant back-invalidates). In this paper, we present a novel approach to addressing the drawbacks of inclusive caches, while retaining its positive features of snoop filtering. We present NCID: a non-inclusive cache, inclusive directory architecture that allows data in the L3 to be non-inclusive or exclusive, but retains tag inclusion in the directory to support complete snoop filtering. We then describe and evaluate a range of NCID-based architecture options and policies. Our evaluation shows that NCID enables a flexible and efficient cache hierarchy for future CMP platforms and has the potential to improve performance significantly for several important server benchmarks.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"95 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":"{\"title\":\"NCID: a non-inclusive cache, inclusive directory architecture for flexible and efficient cache hierarchies\",\"authors\":\"Li Zhao, R. Iyer, S. Makineni, D. Newell, Liqun Cheng\",\"doi\":\"10.1145/1787275.1787314\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Chip-multiprocessor (CMP) architectures employ multi-level cache hierarchies with private L2 caches per core and a shared L3 cache like Intel's Nehalem processor and AMD's Barcelona processor. When designing a multi-level cache hierarchy, one of the key design choices is the inclusion policy: inclusive, non-inclusive or exclusive. Either choice has its benefits and drawbacks. An inclusive cache hierarchy (like Nehalem's L3) has the benefit of allowing incoming snoops to be filtered at the L3 cache, but suffers from (a) reduced space efficiency due to replication between the L2 and L3 caches and (b) reduced flexibility since it cannot bypass the L3 cache for transient or low priority data. In an inclusive L2/L3 cache hierarchy, it also becomes difficult to flexibly chop L3 cache size (or increase L2 cache size) for different product instantiations because the inclusion can start to affect performance (due to significant back-invalidates). In this paper, we present a novel approach to addressing the drawbacks of inclusive caches, while retaining its positive features of snoop filtering. We present NCID: a non-inclusive cache, inclusive directory architecture that allows data in the L3 to be non-inclusive or exclusive, but retains tag inclusion in the directory to support complete snoop filtering. We then describe and evaluate a range of NCID-based architecture options and policies. Our evaluation shows that NCID enables a flexible and efficient cache hierarchy for future CMP platforms and has the potential to improve performance significantly for several important server benchmarks.\",\"PeriodicalId\":151791,\"journal\":{\"name\":\"Proceedings of the 7th ACM international conference on Computing frontiers\",\"volume\":\"95 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-05-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"32\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 7th ACM international conference on Computing frontiers\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1787275.1787314\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 7th ACM international conference on Computing frontiers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1787275.1787314","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
NCID: a non-inclusive cache, inclusive directory architecture for flexible and efficient cache hierarchies
Chip-multiprocessor (CMP) architectures employ multi-level cache hierarchies with private L2 caches per core and a shared L3 cache like Intel's Nehalem processor and AMD's Barcelona processor. When designing a multi-level cache hierarchy, one of the key design choices is the inclusion policy: inclusive, non-inclusive or exclusive. Either choice has its benefits and drawbacks. An inclusive cache hierarchy (like Nehalem's L3) has the benefit of allowing incoming snoops to be filtered at the L3 cache, but suffers from (a) reduced space efficiency due to replication between the L2 and L3 caches and (b) reduced flexibility since it cannot bypass the L3 cache for transient or low priority data. In an inclusive L2/L3 cache hierarchy, it also becomes difficult to flexibly chop L3 cache size (or increase L2 cache size) for different product instantiations because the inclusion can start to affect performance (due to significant back-invalidates). In this paper, we present a novel approach to addressing the drawbacks of inclusive caches, while retaining its positive features of snoop filtering. We present NCID: a non-inclusive cache, inclusive directory architecture that allows data in the L3 to be non-inclusive or exclusive, but retains tag inclusion in the directory to support complete snoop filtering. We then describe and evaluate a range of NCID-based architecture options and policies. Our evaluation shows that NCID enables a flexible and efficient cache hierarchy for future CMP platforms and has the potential to improve performance significantly for several important server benchmarks.