Minimal disturbance placement and promotion
Elvira Teran, Yingying Tian, Zhe Wang, Daniel A. Jiménez
2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), published 2016-03-12
DOI: 10.1109/HPCA.2016.7446065 (https://doi.org/10.1109/HPCA.2016.7446065)
Cache replacement policies often order blocks into distinct positions. A block is placed into a set in some initial position. A re-referenced block is promoted into a higher position, while other blocks may move into lower positions. A block in the lowest position is a candidate for replacement. Tree-based PseudoLRU is a well-known, space-efficient replacement policy that represents block positions as distinct paths in a binary tree. We find that a placement or promotion for one block often needlessly disturbs the non-promoted blocks. Guided by the principle of minimal disturbance, i.e., that a policy should disturb the order of non-promoted blocks to the smallest extent possible, we develop a simple modification to PseudoLRU that improves performance over previous techniques while retaining the low cost of PseudoLRU. The result is a minimal disturbance placement and promotion (MDPP) policy. We first give a static formulation of MDPP and show that it outperforms LRU and PseudoLRU and matches the performance of SRRIP for both single-threaded and multi-core workloads. We then give a dynamic formulation that uses dead block prediction for placement and bypass, and show that it meets or exceeds state-of-the-art performance with lower overhead. For single-threaded workloads, dynamic MDPP matches the 5.9% speedup over LRU of the state-of-the-art policy SHiP. For multi-core workloads, dynamic MDPP gives a normalized weighted speedup of 14.3% over LRU, whereas SHiP yields a speedup of 12.3% over LRU and requires double the storage overhead per set. We show that minimal disturbance policies can reduce the frequency of a costly read-modify-write cycle for replacement state, making them potentially suitable for future work in DRAM caches.
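
The disturbance effect described above can be illustrated with a minimal sketch of textbook tree-based PseudoLRU for an 8-way set. Several things here are assumptions rather than details taken from the paper: the heap-style node numbering, the convention that a block's position is read off its root-to-leaf path with the root as the most significant bit, and the `levels` parameter, which is a hypothetical knob that restricts a promotion to the tree levels nearest the leaf. It is not the MDPP policy itself, whose rules the abstract does not state; it only shows why touching fewer tree bits disturbs fewer non-promoted blocks.

```python
WAYS = 8
LEVELS = 3  # log2(WAYS); the tree has WAYS - 1 internal nodes, numbered 1..7


def path(way):
    """Yield (node, goes_right) for each internal node from the root to `way`."""
    node = 1
    for level in reversed(range(LEVELS)):
        goes_right = (way >> level) & 1
        yield node, goes_right
        node = 2 * node + goes_right


def victim(bits):
    """Follow the tree bits from the root; bit 0 means go left, 1 means go right."""
    node, way = 1, 0
    for _ in range(LEVELS):
        direction = bits[node]
        way = (way << 1) | direction
        node = 2 * node + direction
    return way


def position(bits, way):
    """Assumed convention: 0 is most protected, WAYS - 1 is the next victim.

    Each level contributes a 1 if that node currently points toward `way`.
    """
    pos = 0
    for node, goes_right in path(way):
        pos = (pos << 1) | (1 if bits[node] == goes_right else 0)
    return pos


def promote(bits, way, levels=LEVELS):
    """Return new tree bits with nodes on the path pointing away from `way`.

    `levels` is a hypothetical parameter: only the `levels` nodes closest to
    the leaf are updated. levels=LEVELS gives ordinary PseudoLRU promotion.
    """
    new_bits = list(bits)
    for node, goes_right in list(path(way))[LEVELS - levels:]:
        new_bits[node] = 1 - goes_right
    return new_bits


def disturbed(old_bits, new_bits, promoted_way):
    """Count non-promoted ways whose position changed."""
    return sum(
        position(old_bits, w) != position(new_bits, w)
        for w in range(WAYS)
        if w != promoted_way
    )


if __name__ == "__main__":
    bits = [0] * WAYS  # index 0 unused; all nodes initially point left
    for k in (3, 2, 1):
        after = promote(bits, 0, levels=k)
        print(k, position(after, 0), disturbed(bits, after, 0))
```

Under these assumptions, a full promotion of way 0 from the all-zero state moves it to position 0 but changes the position of all seven other blocks, while updating only the two or one levels nearest the leaf changes three or one other blocks respectively, at the price of a smaller move for the promoted block. This trade-off between how far a block is promoted and how many bystanders are reordered is the intuition behind the minimal disturbance principle.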