A novel approximation scheme for floating-point square root and inverse square root for FPGAs
Pietro Pennestri, Yanqiu Huang, Nikolaos S. Alachiotis
2022 11th International Conference on Modern Circuits and Systems Technologies (MOCAST), 2022-06-08
DOI: https://doi.org/10.1109/mocast54814.2022.9837550
Citations: 1
Abstract
Jointly computing the square root (SQRT) and the inverse square root (ISQRT) of floating-point numbers is common in many algorithms, e.g., in image or time-series data processing when computing norms or normalizing vectors. Existing designs suffer from high latency and inefficient resource utilization because the two operations are carried out by separate architectures. In this paper, we first propose a non-iterative approximation method for computing SQRT and ISQRT, based on the Chebyshev min-max criterion, that reduces latency while meeting the accuracy requirements of various applications. We then design a shared architecture for the two operations and implement it on an FPGA using fewer logic units. In contrast with other approximation solutions, our method does not perform any iterations, and its accuracy can be estimated mathematically. A comparison with vendor-provided IP cores for FPGAs shows that the proposed SQRT/ISQRT floating-point IP core uses fewer resources while reducing the clock-cycle latency by a factor of nearly four.
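The abstract does not give the approximating function, the reduction interval, or the hardware datapath, so the following Python sketch only illustrates the general idea of a non-iterative min-max approximation with a mathematically known error bound. It is not the authors' scheme. Everything in it is an assumption made for illustration: the degree-1 polynomial, the reduction of the input to a mantissa m in [1, 4), and the hypothetical names minimax_line, split, approx_sqrt and approx_isqrt. For a function that is convex or concave on [a, b], the best uniform straight line is parallel to the chord through the endpoints and lies halfway between that chord and the parallel tangent, which gives the error bound in closed form.

```python
import math

def minimax_line(f, a, b, c):
    """Min-max (Chebyshev) straight line for a convex or concave f on [a, b].

    c is the interior point where f'(c) equals the chord slope; it is
    supplied analytically by the caller. Returns (slope, intercept, max_abs_error).
    """
    slope = (f(b) - f(a)) / (b - a)
    chord_intercept = f(a) - slope * a      # line through both endpoints
    tangent_intercept = f(c) - slope * c    # parallel line touching f at c
    intercept = 0.5 * (chord_intercept + tangent_intercept)
    max_err = 0.5 * abs(tangent_intercept - chord_intercept)
    return slope, intercept, max_err

# Assumed reduction interval: mantissa m in [1, 4) with an even binary exponent.
A, B = 1.0, 4.0
# sqrt: chord slope is 1/3, and 1/(2*sqrt(c)) = 1/3 gives c = 9/4.
SQRT_LINE = minimax_line(math.sqrt, A, B, 2.25)
# 1/sqrt: chord slope is -1/6, and -c**(-1.5)/2 = -1/6 gives c = 3**(2/3).
ISQRT_LINE = minimax_line(lambda m: 1.0 / math.sqrt(m), A, B, 3.0 ** (2.0 / 3.0))

def split(x):
    """Write a positive float as x = m * 4**k with m in [1, 4)."""
    m, e = math.frexp(x)        # x = m * 2**e with m in [0.5, 1)
    m, e = m * 2.0, e - 1       # m in [1, 2)
    if e % 2:                   # make the exponent even
        m, e = m * 2.0, e - 1   # m in [2, 4)
    return m, e // 2

def approx_sqrt(x):
    m, k = split(x)
    slope, intercept, _ = SQRT_LINE
    return math.ldexp(slope * m + intercept, k)    # sqrt(m) * 2**k

def approx_isqrt(x):
    m, k = split(x)
    slope, intercept, _ = ISQRT_LINE
    return math.ldexp(slope * m + intercept, -k)   # (1/sqrt(m)) * 2**(-k)

if __name__ == "__main__":
    for x in (0.3, 2.0, 10.0):
        print(x, approx_sqrt(x), math.sqrt(x), approx_isqrt(x), 1.0 / math.sqrt(x))
    print("sqrt max abs error on [1, 4):", SQRT_LINE[2])
```

Both functions share the same exponent handling and differ only in two constants and the sign of the final shift, which is one way a single datapath could serve both operations. A first-degree line is far too coarse for most applications; a practical design would presumably use a higher-degree or piecewise approximant, for which the same equioscillation criterion applies but the coefficients are no longer available in this simple closed form.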