Arkaprava Basu, Sooraj Puthoor, Shuai Che, Bradford M. Beckmann
{"title":"异构处理器的软件辅助硬件缓存一致性","authors":"Arkaprava Basu, Sooraj Puthoor, Shuai Che, Bradford M. Beckmann","doi":"10.1145/2989081.2989092","DOIUrl":null,"url":null,"abstract":"Current trends suggest that future computing platforms will be increasingly heterogeneous. While these heterogeneous processors physically integrate disparate computing elements like CPUs and GPUs on a single chip, their programmability critically depends upon the ability to efficiently support cache coherence and shared virtual memory across tightly-integrated CPUs and GPUs. However, throughput-oriented GPUs easily overwhelm existing hardware coherence mechanisms that long kept the cache hierarchies in multi-core CPUs coherent. This paper proposes a novel solution called Software Assisted Hardware Coherence (SAHC) to scale cache coherence to future heterogeneous processors. We observe that the system software (Operating system and runtime) often has semantic knowledge about sharing patterns of data across the CPU and the GPU. This high-level knowledge can be utilized to effectively provide cache coherence across throughput-oriented GPUs and latency-sensitive CPUs in a heterogeneous processor. SAHC thus proposes a hybrid software-hardware mechanism that judiciously uses hardware coherence only when needed while using software's knowledge to filter out most of the unnecessary coherence traffic. Our evaluation suggests that SAHC can often eliminate up to 98-100% of the hardware coherence lookups, resulting up to 49% reduction in runtime.","PeriodicalId":283512,"journal":{"name":"Proceedings of the Second International Symposium on Memory Systems","volume":"447 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Software Assisted Hardware Cache Coherence for Heterogeneous Processors\",\"authors\":\"Arkaprava Basu, Sooraj Puthoor, Shuai Che, Bradford M. Beckmann\",\"doi\":\"10.1145/2989081.2989092\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Current trends suggest that future computing platforms will be increasingly heterogeneous. While these heterogeneous processors physically integrate disparate computing elements like CPUs and GPUs on a single chip, their programmability critically depends upon the ability to efficiently support cache coherence and shared virtual memory across tightly-integrated CPUs and GPUs. However, throughput-oriented GPUs easily overwhelm existing hardware coherence mechanisms that long kept the cache hierarchies in multi-core CPUs coherent. This paper proposes a novel solution called Software Assisted Hardware Coherence (SAHC) to scale cache coherence to future heterogeneous processors. We observe that the system software (Operating system and runtime) often has semantic knowledge about sharing patterns of data across the CPU and the GPU. This high-level knowledge can be utilized to effectively provide cache coherence across throughput-oriented GPUs and latency-sensitive CPUs in a heterogeneous processor. SAHC thus proposes a hybrid software-hardware mechanism that judiciously uses hardware coherence only when needed while using software's knowledge to filter out most of the unnecessary coherence traffic. Our evaluation suggests that SAHC can often eliminate up to 98-100% of the hardware coherence lookups, resulting up to 49% reduction in runtime.\",\"PeriodicalId\":283512,\"journal\":{\"name\":\"Proceedings of the Second International Symposium on Memory Systems\",\"volume\":\"447 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-10-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Second International Symposium on Memory Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2989081.2989092\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Second International Symposium on Memory Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2989081.2989092","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Software Assisted Hardware Cache Coherence for Heterogeneous Processors
Current trends suggest that future computing platforms will be increasingly heterogeneous. While these heterogeneous processors physically integrate disparate computing elements like CPUs and GPUs on a single chip, their programmability critically depends upon the ability to efficiently support cache coherence and shared virtual memory across tightly-integrated CPUs and GPUs. However, throughput-oriented GPUs easily overwhelm existing hardware coherence mechanisms that long kept the cache hierarchies in multi-core CPUs coherent. This paper proposes a novel solution called Software Assisted Hardware Coherence (SAHC) to scale cache coherence to future heterogeneous processors. We observe that the system software (Operating system and runtime) often has semantic knowledge about sharing patterns of data across the CPU and the GPU. This high-level knowledge can be utilized to effectively provide cache coherence across throughput-oriented GPUs and latency-sensitive CPUs in a heterogeneous processor. SAHC thus proposes a hybrid software-hardware mechanism that judiciously uses hardware coherence only when needed while using software's knowledge to filter out most of the unnecessary coherence traffic. Our evaluation suggests that SAHC can often eliminate up to 98-100% of the hardware coherence lookups, resulting up to 49% reduction in runtime.