Lossless Parallel Implementation of a Turbo Decoder on GPU

K. Natarajan, N. Chandrachoodan
{"title":"Lossless Parallel Implementation of a Turbo Decoder on GPU","authors":"K. Natarajan, N. Chandrachoodan","doi":"10.1109/HiPC.2018.00023","DOIUrl":null,"url":null,"abstract":"Turbo decoders use the recursive BCJR algorithm which is computationally intensive and hard to parallelise. The branch metric and extrinsic log-likelihood ratio computations are easily parallelisable, but the forward and backward metric computation is not parallelisable without compromising bit error rate. This paper proposes a lossless parallelisation technique for Turbo decoders on Graphics Processing Units (GPU). The recursive forward and backward metric computation is formulated as prefix (scan) matrix multiplication problem which is computed on the GPU using parallel prefix sum computation technique. Overall, this method achieves a throughput of 73 Mbps for a 3GPP LTE compliant turbo decoder without any BER loss and latency as low as 61 μs.","PeriodicalId":113335,"journal":{"name":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 25th International Conference on High Performance Computing (HiPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HiPC.2018.00023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Turbo decoders use the recursive BCJR algorithm which is computationally intensive and hard to parallelise. The branch metric and extrinsic log-likelihood ratio computations are easily parallelisable, but the forward and backward metric computation is not parallelisable without compromising bit error rate. This paper proposes a lossless parallelisation technique for Turbo decoders on Graphics Processing Units (GPU). The recursive forward and backward metric computation is formulated as prefix (scan) matrix multiplication problem which is computed on the GPU using parallel prefix sum computation technique. Overall, this method achieves a throughput of 73 Mbps for a 3GPP LTE compliant turbo decoder without any BER loss and latency as low as 61 μs.
Turbo解码器在GPU上的无损并行实现
Turbo解码器使用递归BCJR算法,该算法计算量大,难以并行化。分支度量和外在对数似然比计算容易并行化,但在不影响误码率的情况下,前向和后向度量计算不能并行化。提出了一种用于图形处理器(GPU)上Turbo解码器的无损并行化技术。将递归前向和后向度量计算表述为前缀(扫描)矩阵乘法问题,利用并行前缀和计算技术在GPU上进行计算。总体而言,该方法实现了符合3GPP LTE标准的涡轮解码器的73 Mbps吞吐量,没有任何误码率损失,延迟低至61 μs。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信