Mieszko Lis, Keun Sup Shim, Myong Hyon Cho, S. Devadas
{"title":"Memory coherence in the age of multicores","authors":"Mieszko Lis, Keun Sup Shim, Myong Hyon Cho, S. Devadas","doi":"10.1109/ICCD.2011.6081367","DOIUrl":null,"url":null,"abstract":"As we enter an era of exascale multicores, the question of efficiently supporting a shared memory model has become of paramount importance. On the one hand, programmers demand the convenience of coherent shared memory; on the other, growing core counts place higher demands on the memory subsystem and increasing on-chip distances mean that interconnect delays are becoming a significant part of memory access latencies. In this article, we first review the traditional techniques for providing a shared memory abstraction at the hardware level in multicore systems. We describe two new schemes that guarantee coherent shared memory without the complexity and overheads of a cache coherence protocol, namely execution migration and library cache coherence. We compare these approaches using an analytical model based on average memory latency, and give intuition for the strengths and weaknesses of each. Finally, we describe hybrid schemes that combine the strengths of different schemes.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 29th International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD.2011.6081367","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22
Abstract
As we enter an era of exascale multicores, the question of efficiently supporting a shared memory model has become of paramount importance. On the one hand, programmers demand the convenience of coherent shared memory; on the other, growing core counts place higher demands on the memory subsystem and increasing on-chip distances mean that interconnect delays are becoming a significant part of memory access latencies. In this article, we first review the traditional techniques for providing a shared memory abstraction at the hardware level in multicore systems. We describe two new schemes that guarantee coherent shared memory without the complexity and overheads of a cache coherence protocol, namely execution migration and library cache coherence. We compare these approaches using an analytical model based on average memory latency, and give intuition for the strengths and weaknesses of each. Finally, we describe hybrid schemes that combine the strengths of different schemes.