GPU-Based ECC Decode Unit for Efficient Massive Data Reception Acceleration

J. Inf. Process. Syst. | Pub Date: 2020-12-01 | DOI: 10.3745/JIPS.01.0060

Jisu Kwon, Moon Gi Seok, Daejin Park


Abstract

When large amounts of data are transmitted and received, reliable communication is crucial for the normal operation of a device and for preventing abnormal behavior caused by errors. This paper therefore assumes that an error correction code (ECC), which can detect and correct errors by itself, is used in an environment where massive data is received sequentially. Because an embedded system has limited resources, such as a low-performance processor or a small memory, its applications must run efficiently. We propose an accelerated ECC-decoding technique that uses the graphics processing unit (GPU) built into the embedded system when receiving a large amount of data. The Hamming code used for the ECC operation is computed by matrix–vector multiplication; the matrix is expressed in compressed sparse row (CSR) format, and a sparse matrix–vector product is used. The multiplication is performed in a GPU kernel, and the Hamming code computation is also accelerated so that the ECC operation can be performed in parallel. The proposed technique is implemented with CUDA on a GPU-embedded target board, the NVIDIA Jetson TX2, and its execution time is compared with that of the CPU.
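
The CSR-based syndrome computation mentioned in the abstract can be illustrated with a small CPU-side sketch. The abstract does not state the code parameters or the kernel layout, so everything here is an assumption for illustration: a Hamming(7,4) code, a parity-check matrix whose columns are the binary representations of the bit positions, plain Python instead of CUDA, and the hypothetical helper names `to_csr`, `spmv_gf2`, `decode`, and `decode_batch`. Because every nonzero entry of a binary matrix is 1, the CSR value array is omitted and only the row pointers and column indices are kept.

```python
# Sketch only: Hamming(7,4) single-error correction with the parity-check
# matrix stored in CSR form. The code size and matrix are assumed, not taken
# from the paper.

# Parity-check matrix H (3 x 7). Column j (1-based) is j written in binary,
# so a nonzero syndrome is the 1-based position of a single flipped bit.
H_DENSE = [
    [1, 0, 1, 0, 1, 0, 1],  # least significant bit of the position
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],  # most significant bit of the position
]


def to_csr(dense):
    """Convert a 0/1 matrix to CSR (row pointers and column indices only)."""
    row_ptr, col_idx = [0], []
    for row in dense:
        col_idx.extend(j for j, v in enumerate(row) if v)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx


def spmv_gf2(row_ptr, col_idx, x):
    """Sparse matrix-vector product over GF(2): XOR the selected bits per row."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc ^= x[col_idx[k]]
        y.append(acc)
    return y


def decode(row_ptr, col_idx, word):
    """Return (corrected word, error position); position 0 means no error."""
    s = spmv_gf2(row_ptr, col_idx, word)
    pos = s[0] | (s[1] << 1) | (s[2] << 2)
    fixed = list(word)
    if pos:
        fixed[pos - 1] ^= 1  # flip the bit the syndrome points at
    return fixed, pos


ROW_PTR, COL_IDX = to_csr(H_DENSE)


def decode_batch(words):
    """Decode each received word independently of the others."""
    return [decode(ROW_PTR, COL_IDX, w)[0] for w in words]


# Usage: bit 5 of the valid codeword [0, 1, 1, 0, 0, 1, 1] has been flipped.
corrected, position = decode(ROW_PTR, COL_IDX, [0, 1, 1, 0, 1, 1, 1])
# corrected == [0, 1, 1, 0, 0, 1, 1], position == 5
```

Each word in `decode_batch` is decoded without reference to any other word, and each row of the product is computed independently. That independence is what makes the workload a candidate for parallel execution, for example by assigning one GPU thread per received word or per matrix row; the actual mapping used in the paper is not described in the abstract.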
