AES Encryption Implementation on CUDA GPU and Its Analysis

Keisuke Iwai, T. Kurokawa, Naoki Nishikawa
2010 First International Conference on Networking and Computing. Published 2010-11-17. DOI: 10.1109/IC-NC.2010.49 (https://doi.org/10.1109/IC-NC.2010.49)
Citations: 54

Abstract

GPUs offer a good price-performance ratio and are well suited to highly parallel applications despite their low cost. Support for integer and logical instructions in the latest generation of GPUs makes it easier to implement cipher algorithms using those instructions. However, decisions such as the granularity of parallel processing and the allocation of memory place a heavy burden on programmers. For this reason, this paper reports the results of several experiments that study the relationship between how AES parameters are allocated in memory and the granularity of the parallelism exploited in the AES encryption process, using CUDA on an NVIDIA GeForce GTX 285. The experiments showed that a granularity of 16 bytes per thread gave the highest performance, achieving approximately 35 Gbps throughput. Moreover, an implementation that overlaps processing with data transfer achieved 22.5 Gbps throughput including data transfer time. The results also show that choosing the granularity and the memory allocation carefully is important for efficient AES encryption on a GPU.
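To make the figures in the abstract more concrete, the following is a minimal arithmetic sketch, not the authors' code. It assumes that "Gbps" means 10^9 bits per second and that a 16 byte/thread granularity corresponds to one 128-bit AES block per thread. The function names, the 1 MiB input size, and the 4 byte/thread comparison value are hypothetical and chosen only for illustration.

```python
AES_BLOCK_BYTES = 16  # AES operates on 128-bit (16-byte) blocks


def threads_needed(total_bytes: int, bytes_per_thread: int) -> int:
    """Threads required when each thread processes bytes_per_thread bytes.

    Hypothetical helper: it only illustrates how granularity changes the
    thread count, not how the paper's kernels are organized.
    """
    if total_bytes % AES_BLOCK_BYTES != 0:
        raise ValueError("input must be a whole number of 16-byte AES blocks")
    # Ceiling division without floating point.
    return -(-total_bytes // bytes_per_thread)


def gbps_to_blocks_per_second(gbps: float) -> float:
    """Convert Gbps (assumed to be 10**9 bits/s) to AES blocks per second."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second / AES_BLOCK_BYTES


if __name__ == "__main__":
    one_mib = 1024 * 1024  # hypothetical input size
    # 16 bytes/thread: one thread per AES block.
    print(threads_needed(one_mib, 16))
    # A finer, hypothetical granularity needs four times as many threads.
    print(threads_needed(one_mib, 4))
    # Reported kernel throughput versus throughput including transfers.
    print(gbps_to_blocks_per_second(35.0))
    print(gbps_to_blocks_per_second(22.5))
    print(22.5 / 35.0)  # fraction retained once transfer time is included
```

Under these assumptions, 35 Gbps corresponds to roughly 4.4 GB/s, or about 273 million AES blocks per second, and the 22.5 Gbps figure that includes data transfer is about 64% of the kernel-only rate.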