Authors: Kyu Yeun Kim, Seunghoe Kim, Woongki Baek
Venue: Proceedings of the 3rd International Workshop on Many-core Embedded Systems
Published: 2015-06-13
DOI: https://doi.org/10.1145/2768177.2768179
On the Feasibility of Advanced Cache Indexing for High-Performance and Energy-Efficient GPGPU Computing
To achieve higher performance and energy efficiency, GPGPU architectures have recently begun to employ hardware caches. Adding hardware caches to GPGPUs, however, does not automatically improve performance or energy efficiency, because small hardware caches shared by thousands of threads are prone to thrashing. While prior work has proposed warp scheduling and cache bypassing techniques to address this issue, relatively little work has examined advanced cache indexing. To bridge this gap, this work investigates the feasibility of advanced cache indexing for high-performance and energy-efficient GPGPU computing. We first discuss the design and implementation of static and adaptive cache indexing schemes for GPGPUs. We then quantify the effectiveness of these schemes using GPGPU benchmarks. Our quantitative evaluation shows that the advanced cache indexing schemes are promising: they significantly outperform the conventional cache indexing scheme. In addition, for a subset of cache-sensitive benchmarks, the adaptive indexing scheme substantially outperforms the static indexing scheme by identifying and using high-quality indexing bits based on runtime information. Finally, our evaluation shows that the effectiveness of advanced cache indexing is sensitive to the choice of warp scheduler, motivating further research on coordinated cache indexing and warp scheduling techniques.
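To make the contrast between conventional and advanced indexing concrete, the following is a minimal sketch, not the schemes evaluated in the paper. It assumes a hypothetical cache with 128-byte lines and 32 sets, and uses XOR-based index hashing as one well-known example of a static indexing scheme; the abstract does not say which hash functions the authors use. All names (`conventional_index`, `xor_index`, `sets_touched`) are illustrative.

```python
# Assumed, illustrative cache geometry (not taken from the paper).
LINE_BYTES = 128                             # cache line size in bytes
NUM_SETS = 32                                # number of sets, a power of two
OFFSET_BITS = LINE_BYTES.bit_length() - 1    # 7 line-offset bits
INDEX_BITS = NUM_SETS.bit_length() - 1       # 5 set-index bits


def conventional_index(addr: int) -> int:
    """Conventional indexing: use the address bits just above the line offset."""
    return (addr >> OFFSET_BITS) & (NUM_SETS - 1)


def xor_index(addr: int) -> int:
    """Static XOR indexing (an assumed example scheme): XOR the conventional
    index bits with the lowest INDEX_BITS bits of the tag."""
    line = addr >> OFFSET_BITS
    low = line & (NUM_SETS - 1)
    high = (line >> INDEX_BITS) & (NUM_SETS - 1)
    return low ^ high


def sets_touched(index_fn, addrs) -> int:
    """Count how many distinct cache sets a sequence of addresses maps to."""
    return len({index_fn(a) for a in addrs})


# A strided access pattern whose stride equals NUM_SETS * LINE_BYTES bytes.
stride = NUM_SETS * LINE_BYTES
addrs = [i * stride for i in range(32)]

print(sets_touched(conventional_index, addrs))  # 1: every access hits set 0
print(sets_touched(xor_index, addrs))           # 32: accesses spread over all sets
```

With this stride, conventional indexing maps all 32 lines to a single set, so they evict one another unless the associativity is at least 32, while the XOR hash spreads them across every set. An adaptive scheme, as described in the abstract, would go further and choose which address bits to use at runtime; that selection logic is not sketched here because the abstract does not specify it.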