Instruction Offloading with HMC 2.0 Standard: A Case Study for Graph Traversals
Lifeng Nai, Hyesoon Kim
Proceedings of the 2015 International Symposium on Memory Systems, 2015-10-05
DOI: 10.1145/2818950.2818982 (https://doi.org/10.1145/2818950.2818982)
Citations: 32
Abstract
Processing in Memory (PIM) was first proposed decades ago to reduce the overhead of data movement between core and memory. With advances in 3D-stacking technologies, PIM architectures have recently regained researchers' attention. Several fully programmable PIM architectures, as well as programming models, have been proposed in the literature. Meanwhile, the memory industry has also started to integrate computation units into the Hybrid Memory Cube (HMC). The HMC 2.0 specification supports a number of atomic instructions. Although this instruction support is limited, it enables us to offload computations at instruction granularity. In this paper, we present a preliminary study of instruction offloading on HMC 2.0 using graph traversals as an example. By demonstrating the programmability and performance benefits, we show the feasibility of an instruction-level offloading PIM architecture.
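
As a rough illustration of what offloading at instruction granularity could mean for a graph traversal, the sketch below runs a breadth-first search in which each vertex is claimed by a single compare-and-swap on a per-vertex property array. This is not taken from the paper: the helper `cas_equal`, the `UNVISITED` sentinel, and the sample graph are hypothetical, and the atomic is modeled in plain host code. Whether a given operation is actually covered by the HMC 2.0 atomic set should be checked against the specification.

```python
from collections import deque

UNVISITED = -1  # sentinel depth for vertices not yet reached (illustrative choice)


def cas_equal(mem, addr, expected, new):
    """Model of a compare-and-swap-if-equal atomic.

    Hypothetical helper, not an HMC API: on a PIM-capable memory this
    read-modify-write is the kind of single instruction that could run
    inside the memory instead of on the core. Returns True on a swap.
    """
    if mem[addr] == expected:
        mem[addr] = new
        return True
    return False


def bfs_depths(adj, source):
    """Breadth-first traversal that claims each vertex with one CAS."""
    depth = [UNVISITED] * len(adj)
    depth[source] = 0
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            # Offload candidate: one atomic update on the property array.
            if cas_equal(depth, v, UNVISITED, depth[u] + 1):
                frontier.append(v)
    return depth


# Small undirected graph as adjacency lists (made-up data).
graph = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3]]
print(bfs_depths(graph, 0))  # [0, 1, 1, 2, 3]
```

The rest of the traversal (frontier management, neighbor iteration) stays on the host in this sketch; only the marked update is the part that an instruction-level scheme would hand to memory.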