{"title":"Accelerating the SM3 hash algorithm with CPU-FPGA Co-Designed architecture","authors":"Xiaoying Huang, Zhichuan Guo, Mangu Song, Xuewen Zeng","doi":"10.1049/cdt2.12034","DOIUrl":null,"url":null,"abstract":"<p>SM3 hash algorithm developed by the Chinese Government is used in various fields of information security, and it is being widely used in commercial security products. However, the performance of implementation on the software architecture is not sufficient for high-speed applications. This study proposes a CPU-FPGA co-designed architecture which offloads the SM3 function on field-programmable gate array so that high throughput can be achieved. The architecture can execute the SM3 hash algorithm with 16 concurrent streams or more, which means that multiple data streams can be processed in parallel. This design is implemented on the Xilinx XCKU115-flva1517-2-e device and Dell commercial server, and the throughput of this design can reach up to 35.5 Gbps when 16 individual SM3 modules are processed in parallel. The proposed architecture results in an excellent performance in the CPU-FPGA-coupled environment.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"15 6","pages":"427-436"},"PeriodicalIF":1.1000,"publicationDate":"2021-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12034","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Computers and Digital Techniques","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cdt2.12034","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 2
Abstract
SM3 hash algorithm developed by the Chinese Government is used in various fields of information security, and it is being widely used in commercial security products. However, the performance of implementation on the software architecture is not sufficient for high-speed applications. This study proposes a CPU-FPGA co-designed architecture which offloads the SM3 function on field-programmable gate array so that high throughput can be achieved. The architecture can execute the SM3 hash algorithm with 16 concurrent streams or more, which means that multiple data streams can be processed in parallel. This design is implemented on the Xilinx XCKU115-flva1517-2-e device and Dell commercial server, and the throughput of this design can reach up to 35.5 Gbps when 16 individual SM3 modules are processed in parallel. The proposed architecture results in an excellent performance in the CPU-FPGA-coupled environment.
期刊介绍:
IET Computers & Digital Techniques publishes technical papers describing recent research and development work in all aspects of digital system-on-chip design and test of electronic and embedded systems, including the development of design automation tools (methodologies, algorithms and architectures). Papers based on the problems associated with the scaling down of CMOS technology are particularly welcome. It is aimed at researchers, engineers and educators in the fields of computer and digital systems design and test.
The key subject areas of interest are:
Design Methods and Tools: CAD/EDA tools, hardware description languages, high-level and architectural synthesis, hardware/software co-design, platform-based design, 3D stacking and circuit design, system on-chip architectures and IP cores, embedded systems, logic synthesis, low-power design and power optimisation.
Simulation, Test and Validation: electrical and timing simulation, simulation based verification, hardware/software co-simulation and validation, mixed-domain technology modelling and simulation, post-silicon validation, power analysis and estimation, interconnect modelling and signal integrity analysis, hardware trust and security, design-for-testability, embedded core testing, system-on-chip testing, on-line testing, automatic test generation and delay testing, low-power testing, reliability, fault modelling and fault tolerance.
Processor and System Architectures: many-core systems, general-purpose and application specific processors, computational arithmetic for DSP applications, arithmetic and logic units, cache memories, memory management, co-processors and accelerators, systems and networks on chip, embedded cores, platforms, multiprocessors, distributed systems, communication protocols and low-power issues.
Configurable Computing: embedded cores, FPGAs, rapid prototyping, adaptive computing, evolvable and statically and dynamically reconfigurable and reprogrammable systems, reconfigurable hardware.
Design for variability, power and aging: design methods for variability, power and aging aware design, memories, FPGAs, IP components, 3D stacking, energy harvesting.
Case Studies: emerging applications, applications in industrial designs, and design frameworks.