Tianyu Bai, Spencer Davis, Juanjuan Li, Ying Gu, Hai Jiang
Int. J. Networked Distributed Comput., published 2014-11-01. DOI: 10.2991/ijndc.2014.2.4.6 (https://doi.org/10.2991/ijndc.2014.2.4.6)
Accelerating NTRU Encryption with Graphics Processing Units
Lattice-based cryptography is attractive for its resistance to quantum computing attacks and its efficient encryption and decryption. However, most lattice-based cryptosystems become too slow when they must process large volumes of data. This paper analyzes one of the major lattice-based cryptosystems, NTRU (Nth-degree truncated polynomial ring), and accelerates it with graphics processing units (GPUs) to reach an acceptable processing speed. Three strategies are proposed and compared: a single GPU with zero copy, a single GPU with explicit data transfer, and a multi-GPU version. GPU computing techniques such as streams and zero copy are applied to overlap computation with communication for possible speedup. Experimental results demonstrate that GPU acceleration of NTRU is effective, and that NTRU performance improves as the number of devices increases.
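For readers unfamiliar with NTRU, the following is a minimal CPU-side sketch of its core arithmetic, not the authors' GPU implementation. NTRU works in the truncated polynomial ring Z_q[x]/(x^N - 1), where multiplication is a cyclic convolution, and textbook encryption computes e = r * h + m (mod q). The function names and the toy parameters (N = 3, q = 32) are assumptions chosen for illustration only; real parameter sets are far larger, and key generation is omitted.

```python
def ring_multiply(a, b, n, q):
    """Multiply polynomials a and b in Z_q[x]/(x^n - 1) (cyclic convolution).

    a and b are coefficient lists of length n, lowest degree first.
    """
    result = [0] * n
    for i in range(n):
        for j in range(n):
            # x^(i+j) wraps around to x^((i+j) mod n) because x^n = 1.
            k = (i + j) % n
            result[k] = (result[k] + a[i] * b[j]) % q
    return result


def ntru_encrypt(message, blinding, public_key, n, q):
    """Textbook NTRU encryption: e = r * h + m (mod q).

    message and blinding are small polynomials (coefficients in {-1, 0, 1});
    public_key is h. All names here are hypothetical, for illustration.
    """
    rh = ring_multiply(blinding, public_key, n, q)
    return [(c + m) % q for c, m in zip(rh, message)]


# Toy example (assumed values, not a secure parameter set).
h = [5, 6, 7]    # public key h(x) = 5 + 6x + 7x^2
r = [0, 1, 0]    # blinding polynomial r(x) = x
m = [1, 0, -1]   # message polynomial m(x) = 1 - x^2
print(ntru_encrypt(m, r, h, n=3, q=32))  # [8, 5, 5]
```

Each output coefficient of the convolution depends only on the inputs, not on the other outputs, so the coefficients can in principle be computed by independent threads. That is one plausible reason this operation suits a GPU.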