Three-stage pipeline implementation for SHA2 using data forwarding
Authors: Anh-Tuan Hoang, K. Yamazaki, S. Oyanagi
Published in: 2008 International Conference on Field Programmable Logic and Applications, 2008-09-23
DOI: 10.1109/FPL.2008.4629903
The Secure Hash Algorithm 512 (SHA-512), which is used to verify the integrity of a message, processes data through an iterative main loop. The long computation delay within each iteration limits the throughput of the whole system and makes the computation difficult to pipeline. To shorten the computation time of one iteration of the main loop, we use data forwarding. We introduce an architecture that performs the data computation of one iteration simultaneously with the data movement of the next. The computations are then split into two stages for one operand and three stages for another operand. The implementation occupies 1,520 slices on a Xilinx Virtex-4 family FPGA and achieves a throughput of nearly 2.2 Gbps, giving a better throughput-to-area ratio than the related work.
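The abstract does not give the datapath, so the following is only a software sketch of the general idea, not the authors' design. It implements the standard SHA-512 main loop and shows that in each iteration six of the eight working variables are merely moved, while only the new a and e are computed. As one illustrative form of forwarding (an assumption, not taken from the paper), the term h + K[t+1] + W[t+1] for the next iteration is prepared during the current one, which is possible because the next h is simply the current g. The helper names (_primes, _icbrt, compress, pre) are hypothetical.

```python
import hashlib
from math import isqrt

MASK = (1 << 64) - 1  # SHA-512 works on 64-bit words

def _primes(n):
    """Return the first n primes by trial division."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def _icbrt(x):
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((x.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Round constants and initial state: the first 64 fractional bits of the
# cube roots (first 80 primes) and square roots (first 8 primes).
K = [_icbrt(p << 192) & MASK for p in _primes(80)]
H0 = [isqrt(p << 128) & MASK for p in _primes(8)]

def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK

def compress(state, block):
    """Process one 128-byte block with a forwarded partial sum."""
    w = [int.from_bytes(block[i:i + 8], "big") for i in range(0, 128, 8)]
    for t in range(16, 80):  # message schedule
        s0 = rotr(w[t - 15], 1) ^ rotr(w[t - 15], 8) ^ (w[t - 15] >> 7)
        s1 = rotr(w[t - 2], 19) ^ rotr(w[t - 2], 61) ^ (w[t - 2] >> 6)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    pre = (h + K[0] + w[0]) & MASK  # partial sum prepared before round 0
    for t in range(80):
        sig1 = rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)
        ch = (e & f) ^ (~e & g)
        t1 = (pre + sig1 + ch) & MASK
        sig0 = rotr(a, 28) ^ rotr(a, 34) ^ rotr(a, 39)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (sig0 + maj) & MASK
        if t < 79:
            # Forwarding (illustrative assumption): next round's h is this
            # round's g, so its partial sum can be formed already now.
            pre = (g + K[t + 1] + w[t + 1]) & MASK
        # Only a and e are computed; b, c, d, f, g, h are pure data movement.
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]

def sha512(msg: bytes) -> bytes:
    """SHA-512 digest of msg using the compress() sketch above."""
    padded = msg + b"\x80"
    padded += b"\x00" * ((112 - len(padded)) % 128)
    padded += (8 * len(msg)).to_bytes(16, "big")
    state = list(H0)
    for i in range(0, len(padded), 128):
        state = compress(state, padded[i:i + 128])
    return b"".join(v.to_bytes(8, "big") for v in state)

if __name__ == "__main__":
    # Cross-check against the standard library implementation.
    print(sha512(b"abc") == hashlib.sha512(b"abc").digest())
```

In hardware, preparing the partial sum one iteration ahead removes additions from the path that produces the new a and e, which is the kind of per-iteration delay the abstract refers to. How the paper divides that path into two and three stages is not stated in the abstract and is not modelled here.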