sAXI: A High-Efficient Hardware Inter-Node Link in ARM Server for Remote Memory Access
Ke Zhang, Yisong Chang, Lixin Zhang, Mingyu Chen, Lei Yu, Zhiwei Xu
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), published 2016-05-16
DOI: 10.1109/CCGrid.2016.66 (https://doi.org/10.1109/CCGrid.2016.66)
Citation count: 3
Abstract
The ever-growing need for fast big-data operations has made in-memory processing increasingly important in modern datacenters. To overcome the memory capacity limit of a single server node, techniques for intra-rack cross-node memory access have recently drawn attention. However, existing proposals access remote memory inefficiently because of inter-protocol conversions and non-transparent, coarse-grained accesses. In this study, we propose the serialized AXI (sAXI) link, a high-performance and efficient inter-node link, together with its associated cross-node memory access mechanism for emerging ARM-based servers. The key idea behind sAXI is to extend the on-chip AMBA AXI-4.0 interconnect of the SoC in a local server node directly off-chip and carry it into remote server nodes over high-speed serial lanes. As a result, remote memory in adjacent nodes can be accessed natively, in the same manner as local resources, using only existing software. Experimental results on our in-house FPGA prototype show that remote memory access over the sAXI data path performs well in a user-level micro-benchmark (min. latency: 1.16 μs, max. bandwidth: 1.52 GB/s). In addition, this efficient hardware inter-node link improves the performance of an in-memory key-value framework, Redis, by up to 1.72x and effectively hides the large latency overhead of database queries.