{"title":"目录Lookaside表:支持可扩展、低冲突、多核缓存一致性目录","authors":"Xudong Shi, Feiqi Su, J. Peir","doi":"10.1109/PADSW.2014.7097798","DOIUrl":null,"url":null,"abstract":"Maintaining hardware cache coherence on future CMPs becomes increasingly important and difficult as the number of cores keeps accelerating in mainstream multicore chips. The simple snooping-bus coherence scheme is not suitable due to its limited scalability. The sparse coherence directory approach may incur extra cache invalidations due to a topological mismatch between the coherence directory and the directories of all cache modules. In this paper, we propose an innovative CMP coherence directory that has three important properties. First, the directory has a simple set-associative design with small associativity. The number of directory entries matches the total number of cache blocks. Second, an augmented Directory Lookaside Table (DLT) allows blocks to be displaced from their primary sets in the coherence directory for alleviating hot-set conflicts. Third, to avoid expensive presence bits, each copy of a block along with the located core ID occupies a separate directory entry. Performance evaluations based on multithreaded and multi-programmed workloads demonstrate significant advantages of the proposed CMP directory over directories with traditional set-associative or skewed associative designs.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Directory Lookaside Table: Enabling scalable, low-conflict, many-core cache coherence directory\",\"authors\":\"Xudong Shi, Feiqi Su, J. Peir\",\"doi\":\"10.1109/PADSW.2014.7097798\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Maintaining hardware cache coherence on future CMPs becomes increasingly important and difficult as the number of cores keeps accelerating in mainstream multicore chips. The simple snooping-bus coherence scheme is not suitable due to its limited scalability. The sparse coherence directory approach may incur extra cache invalidations due to a topological mismatch between the coherence directory and the directories of all cache modules. In this paper, we propose an innovative CMP coherence directory that has three important properties. First, the directory has a simple set-associative design with small associativity. The number of directory entries matches the total number of cache blocks. Second, an augmented Directory Lookaside Table (DLT) allows blocks to be displaced from their primary sets in the coherence directory for alleviating hot-set conflicts. Third, to avoid expensive presence bits, each copy of a block along with the located core ID occupies a separate directory entry. Performance evaluations based on multithreaded and multi-programmed workloads demonstrate significant advantages of the proposed CMP directory over directories with traditional set-associative or skewed associative designs.\",\"PeriodicalId\":421740,\"journal\":{\"name\":\"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PADSW.2014.7097798\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PADSW.2014.7097798","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Maintaining hardware cache coherence on future CMPs becomes increasingly important and difficult as the number of cores keeps accelerating in mainstream multicore chips. The simple snooping-bus coherence scheme is not suitable due to its limited scalability. The sparse coherence directory approach may incur extra cache invalidations due to a topological mismatch between the coherence directory and the directories of all cache modules. In this paper, we propose an innovative CMP coherence directory that has three important properties. First, the directory has a simple set-associative design with small associativity. The number of directory entries matches the total number of cache blocks. Second, an augmented Directory Lookaside Table (DLT) allows blocks to be displaced from their primary sets in the coherence directory for alleviating hot-set conflicts. Third, to avoid expensive presence bits, each copy of a block along with the located core ID occupies a separate directory entry. Performance evaluations based on multithreaded and multi-programmed workloads demonstrate significant advantages of the proposed CMP directory over directories with traditional set-associative or skewed associative designs.