{"title":"理解英特尔处理器上的动态缓存:方法和应用","authors":"Yi Zhang, Nan Guan, W. Yi","doi":"10.1109/EUC.2014.18","DOIUrl":null,"url":null,"abstract":"The design and implementation of caches on a given platform has significant impacts to many areas in computer system design. On chip-multiprocessors (CMP), new cache architectures are proposed to meet the rapidly increasing performance requirements. However, the cache architectures are usually not well-documented for commercial processors. This raises difficulties for people to precisely understand the working principle of many components of the processors, not only the cache itself, but also the related components like the whole memory subsystem. This paper aims at disclosing the working principle of the last level cache of Intel Ivy Bridge processors. First, we identify the address translation logic on this cache. Second, we disclose the replacement policy of the cache. This is a dynamic insertion replacement policy, which is very different from the widely used LRU policy and its variants. Although this replacement policy has been proposed in academic literatures, our work is the first one showing it is actually used in commercial processors. To show the significance of our discovery, we design a methodology to generate controllable cache miss sequences under this new cache, and apply it to the design of a benchmark to model the memory concurrency. Evaluations on physical machines are conducted to show the effectiveness of the proposed method.","PeriodicalId":331736,"journal":{"name":"2014 12th IEEE International Conference on Embedded and Ubiquitous Computing","volume":"62 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Understanding the Dynamic Caches on Intel Processors: Methods and Applications\",\"authors\":\"Yi Zhang, Nan Guan, W. Yi\",\"doi\":\"10.1109/EUC.2014.18\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The design and implementation of caches on a given platform has significant impacts to many areas in computer system design. On chip-multiprocessors (CMP), new cache architectures are proposed to meet the rapidly increasing performance requirements. However, the cache architectures are usually not well-documented for commercial processors. This raises difficulties for people to precisely understand the working principle of many components of the processors, not only the cache itself, but also the related components like the whole memory subsystem. This paper aims at disclosing the working principle of the last level cache of Intel Ivy Bridge processors. First, we identify the address translation logic on this cache. Second, we disclose the replacement policy of the cache. This is a dynamic insertion replacement policy, which is very different from the widely used LRU policy and its variants. Although this replacement policy has been proposed in academic literatures, our work is the first one showing it is actually used in commercial processors. To show the significance of our discovery, we design a methodology to generate controllable cache miss sequences under this new cache, and apply it to the design of a benchmark to model the memory concurrency. Evaluations on physical machines are conducted to show the effectiveness of the proposed method.\",\"PeriodicalId\":331736,\"journal\":{\"name\":\"2014 12th IEEE International Conference on Embedded and Ubiquitous Computing\",\"volume\":\"62 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-08-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 12th IEEE International Conference on Embedded and Ubiquitous Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/EUC.2014.18\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 12th IEEE International Conference on Embedded and Ubiquitous Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EUC.2014.18","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Understanding the Dynamic Caches on Intel Processors: Methods and Applications
The design and implementation of caches on a given platform has significant impacts to many areas in computer system design. On chip-multiprocessors (CMP), new cache architectures are proposed to meet the rapidly increasing performance requirements. However, the cache architectures are usually not well-documented for commercial processors. This raises difficulties for people to precisely understand the working principle of many components of the processors, not only the cache itself, but also the related components like the whole memory subsystem. This paper aims at disclosing the working principle of the last level cache of Intel Ivy Bridge processors. First, we identify the address translation logic on this cache. Second, we disclose the replacement policy of the cache. This is a dynamic insertion replacement policy, which is very different from the widely used LRU policy and its variants. Although this replacement policy has been proposed in academic literatures, our work is the first one showing it is actually used in commercial processors. To show the significance of our discovery, we design a methodology to generate controllable cache miss sequences under this new cache, and apply it to the design of a benchmark to model the memory concurrency. Evaluations on physical machines are conducted to show the effectiveness of the proposed method.