Lingjia Tang, Jason Mars, Neil Vachharajani, R. Hundt, M. Soffa
{"title":"内存子系统资源共享对数据中心应用程序的影响","authors":"Lingjia Tang, Jason Mars, Neil Vachharajani, R. Hundt, M. Soffa","doi":"10.1145/2000064.2000099","DOIUrl":null,"url":null,"abstract":"In this paper we study the impact of sharing memory resources on five Google datacenter applications: a web search engine, bigtable, content analyzer, image stitching, and protocol buffer. While prior work has found neither positive nor negative effects from cache sharing across the PARSEC benchmark suite, we find that across these datacenter applications, there is both a sizable benefit and a potential degradation from improperly sharing resources. In this paper, we first present a study of the importance of thread-to-core mappings for applications in the datacenter as threads can be mapped to share or to not share caches and bus bandwidth. Second, we investigate the impact of co-locating threads from multiple applications with diverse memory behavior and discover that the best mapping for a given application changes depending on its co-runner. Third, we investigate the application characteristics that impact performance in the various thread-to-core mapping scenarios. Finally, we present both a heuristics-based and an adaptive approach to arrive at good thread-to-core decisions in the datacenter. We observe performance swings of up to 25% for web search and 40% for other key applications, simply based on how application threads are mapped to cores. By employing our adaptive thread-to-core mapper, the performance of the datacenter applications presented in this work improved by up to 22% over status quo thread-to-core mapping and performs within 3% of optimal.","PeriodicalId":340732,"journal":{"name":"2011 38th Annual International Symposium on Computer Architecture (ISCA)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"238","resultStr":"{\"title\":\"The impact of memory subsystem resource sharing on datacenter applications\",\"authors\":\"Lingjia Tang, Jason Mars, Neil Vachharajani, R. Hundt, M. Soffa\",\"doi\":\"10.1145/2000064.2000099\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we study the impact of sharing memory resources on five Google datacenter applications: a web search engine, bigtable, content analyzer, image stitching, and protocol buffer. While prior work has found neither positive nor negative effects from cache sharing across the PARSEC benchmark suite, we find that across these datacenter applications, there is both a sizable benefit and a potential degradation from improperly sharing resources. In this paper, we first present a study of the importance of thread-to-core mappings for applications in the datacenter as threads can be mapped to share or to not share caches and bus bandwidth. Second, we investigate the impact of co-locating threads from multiple applications with diverse memory behavior and discover that the best mapping for a given application changes depending on its co-runner. Third, we investigate the application characteristics that impact performance in the various thread-to-core mapping scenarios. Finally, we present both a heuristics-based and an adaptive approach to arrive at good thread-to-core decisions in the datacenter. We observe performance swings of up to 25% for web search and 40% for other key applications, simply based on how application threads are mapped to cores. By employing our adaptive thread-to-core mapper, the performance of the datacenter applications presented in this work improved by up to 22% over status quo thread-to-core mapping and performs within 3% of optimal.\",\"PeriodicalId\":340732,\"journal\":{\"name\":\"2011 38th Annual International Symposium on Computer Architecture (ISCA)\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-06-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"238\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 38th Annual International Symposium on Computer Architecture (ISCA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2000064.2000099\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 38th Annual International Symposium on Computer Architecture (ISCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2000064.2000099","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The impact of memory subsystem resource sharing on datacenter applications
In this paper we study the impact of sharing memory resources on five Google datacenter applications: a web search engine, bigtable, content analyzer, image stitching, and protocol buffer. While prior work has found neither positive nor negative effects from cache sharing across the PARSEC benchmark suite, we find that across these datacenter applications, there is both a sizable benefit and a potential degradation from improperly sharing resources. In this paper, we first present a study of the importance of thread-to-core mappings for applications in the datacenter as threads can be mapped to share or to not share caches and bus bandwidth. Second, we investigate the impact of co-locating threads from multiple applications with diverse memory behavior and discover that the best mapping for a given application changes depending on its co-runner. Third, we investigate the application characteristics that impact performance in the various thread-to-core mapping scenarios. Finally, we present both a heuristics-based and an adaptive approach to arrive at good thread-to-core decisions in the datacenter. We observe performance swings of up to 25% for web search and 40% for other key applications, simply based on how application threads are mapped to cores. By employing our adaptive thread-to-core mapper, the performance of the datacenter applications presented in this work improved by up to 22% over status quo thread-to-core mapping and performs within 3% of optimal.