Pietro Pennestri, Yanqiu Huang, Nikolaos S. Alachiotis
2022 11th International Conference on Modern Circuits and Systems Technologies (MOCAST), published 2022-06-08
DOI: 10.1109/mocast54814.2022.9837550 (https://doi.org/10.1109/mocast54814.2022.9837550)
A novel approximation scheme for floating-point square root and inverse square root for FPGAs
Jointly computing the square root (SQRT) and the inverse square root (ISQRT) of floating-point numbers is common in many algorithms, e.g., in image or time-series data processing when computing norms or normalizing vectors. Existing designs suffer from high latency and inefficient resource utilization because separate architectures carry out the two operations. In this paper, we first propose a non-iterative approximation method for computing SQRT and ISQRT, based on the Chebyshev min-max criterion, that reduces latency while meeting the accuracy requirements of various applications. We then design a shared architecture for the two operations and implement it on an FPGA with fewer logic units. In contrast with other approximation solutions, our method does not need to perform any iterations, and its accuracy can be estimated mathematically. A comparison with vendor-provided IP cores for FPGAs revealed that our proposed SQRT/ISQRT floating-point IP core uses fewer resources while reducing the clock-cycle latency by a factor of nearly four.
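To make the idea of a non-iterative min-max approximation concrete, the following Python sketch evaluates SQRT and ISQRT from one shared range reduction followed by a single linear polynomial each. It is an illustration under stated assumptions, not the architecture from the paper: the polynomial degree (1), the reduced interval [1, 4), and the names `minimax_line` and `approx_sqrt_isqrt` are all hypothetical choices made here. For a concave or convex function, the best uniform (Chebyshev) line is parallel to the chord and lies midway between the chord and the parallel tangent, so its worst-case error is known in closed form.

```python
import math

def minimax_line(f, tangent_x, a, b):
    """Best uniform linear fit to a convex or concave f on [a, b].

    tangent_x is the point where f' equals the chord slope (derived by hand).
    Returns (slope, intercept) of the line midway between chord and tangent.
    """
    slope = (f(b) - f(a)) / (b - a)
    chord_icpt = f(a) - slope * a
    tangent_icpt = f(tangent_x) - slope * tangent_x
    return slope, 0.5 * (chord_icpt + tangent_icpt)

# sqrt on [1, 4]: chord slope 1/3, and 1/(2*sqrt(c)) = 1/3 gives c = 9/4.
SQRT_A, SQRT_B = minimax_line(math.sqrt, 9.0 / 4.0, 1.0, 4.0)
# 1/sqrt on [1, 4]: chord slope -1/6, and c**(-3/2)/2 = 1/6 gives c = 3**(2/3).
ISQRT_A, ISQRT_B = minimax_line(lambda x: 1.0 / math.sqrt(x),
                                3.0 ** (2.0 / 3.0), 1.0, 4.0)

def approx_sqrt_isqrt(x):
    """Return (sqrt(x), 1/sqrt(x)) approximations for positive finite x."""
    if not (x > 0.0 and math.isfinite(x)):
        raise ValueError("x must be positive and finite")
    m, e = math.frexp(x)           # x = m * 2**e with 0.5 <= m < 1
    if e % 2 == 0:
        m, e = m * 4.0, e - 2      # mantissa in [2, 4), exponent stays even
    else:
        m, e = m * 2.0, e - 1      # mantissa in [1, 2), exponent now even
    half = e // 2                  # sqrt(2**e) = 2**half exactly
    # Shared range reduction, one multiply-add per output, no iterations.
    s = math.ldexp(SQRT_A * m + SQRT_B, half)
    r = math.ldexp(ISQRT_A * m + ISQRT_B, -half)
    return s, r
```

With these assumptions the SQRT line is m/3 + 17/24, whose absolute error on [1, 4) equioscillates at 1/24, which shows how the accuracy of such a scheme can be bounded analytically rather than measured. A degree-1 fit is far too coarse for most uses; a practical design would presumably use a higher degree or a segmented interval.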