AVX-512 Based Software Decoding for 5G LDPC Codes
Yi Xu, Wen Wang, Z. Xu, Xiqi Gao
2019 IEEE International Workshop on Signal Processing Systems (SiPS), published 2019-10-01
DOI: 10.1109/SiPS47522.2019.9020587 (https://doi.org/10.1109/SiPS47522.2019.9020587)
Citation count: 3
Abstract
In this paper, we investigate how 5G NR LDPC codes can be decoded effectively on general-purpose processors (GPPs) with single-instruction multiple-data (SIMD) acceleration, and we evaluate the achievable throughput on recently released Intel Xeon CPUs. First, we present a general software implementation architecture with SIMD acceleration for horizontal-layered LDPC decoding, in which parallelism is achieved in an intra-block manner. By using the Intel Advanced Vector Extensions 512 (AVX-512) instruction set, the efficiency of the parallelism is maximized, so the capacity of x86 processors can be fully exploited. In addition, new features of AVX-512 are exploited to optimize load and store operations as well as preprocessing, reducing the operation cost. Experimental results show that an Intel Xeon Gold 6154 processor achieves 42 to 272 Mbps throughput on a single core with ten layered decoding iterations, across various code rates and block lengths. The typical processing latency is below 100 $\mu$s. Consequently, an 18-core Intel Xeon CPU can achieve up to 5 Gbps decoding throughput.
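To make the horizontal-layered schedule and its intra-block parallelism concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes a min-sum check-node rule (the abstract does not name one), a quasi-cyclic code in which each layer is a list of (column, cyclic shift) pairs, and a toy base graph invented for this example. The names `update_layer`, `decode`, and `TOY_LAYERS` are hypothetical. The point to notice is the loop over lanes `r`: the lanes are independent, so this is the dimension a SIMD implementation would map onto vector lanes.

```python
def update_layer(llr, msgs, edges, z):
    """One horizontal-layer min-sum update (assumed rule, not from the paper).

    llr   : list of columns, each a list of z a-posteriori LLRs (updated in place)
    msgs  : msgs[e][r] is the stored check-to-variable message for edge e,
            lane r, from the previous iteration (updated in place)
    edges : (column, shift) pairs, one per nonzero circulant in this layer;
            at least two edges are required
    z     : lifting size; lane r of the layer is check node r
    """
    # Variable-to-check values, cyclically shifted so that lane r of every
    # edge belongs to the same check node.
    q = [[llr[col][(r + shift) % z] - msgs[e][r] for r in range(z)]
         for e, (col, shift) in enumerate(edges)]

    for r in range(z):  # independent lanes: the loop SIMD would vectorize
        mags = [abs(q[e][r]) for e in range(len(edges))]
        min1 = min(mags)
        idx = mags.index(min1)                   # edge holding the minimum
        min2 = min(mags[:idx] + mags[idx + 1:])  # minimum over the other edges
        sign = 1
        for e in range(len(edges)):
            if q[e][r] < 0:
                sign = -sign                     # product of all input signs
        for e, (col, shift) in enumerate(edges):
            mag = min2 if e == idx else min1     # exclude the edge's own input
            s = -sign if q[e][r] < 0 else sign   # remove the edge's own sign
            msgs[e][r] = s * mag
            llr[col][(r + shift) % z] = q[e][r] + msgs[e][r]


def decode(channel_llr, layers, z, iterations=10):
    """Run layered decoding; positive LLR means bit 0. Returns hard decisions."""
    llr = [list(col) for col in channel_llr]
    msgs = [[[0] * z for _ in edges] for edges in layers]
    for _ in range(iterations):
        for edges, layer_msgs in zip(layers, msgs):
            update_layer(llr, layer_msgs, edges, z)
    return [[0 if v >= 0 else 1 for v in col] for col in llr]


# Toy base graph (hypothetical): 2 layers, 4 columns, lifting size 4.
TOY_LAYERS = [[(0, 0), (1, 1), (2, 2)],
              [(1, 0), (2, 1), (3, 3)]]

# All-zero codeword with one weakly wrong LLR at column 1, lane 0.
channel = [[4] * 4 for _ in range(4)]
channel[1][0] = -1
bits = decode(channel, TOY_LAYERS, z=4)
```

Because each layer updates the a-posteriori LLRs before the next layer reads them, information propagates within a single iteration. In a vectorized build, each list comprehension and lane loop above would become a handful of vector operations over the lanes of one block, which is one way to read the "intra-block" parallelism described in the abstract.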