{"title":"低功耗除法:基数4,8和16的实现比较","authors":"A. Nannarelli, T. Lang","doi":"10.1109/ARITH.1999.762829","DOIUrl":null,"url":null,"abstract":"Although division is less frequent than addition and multiplication, because of its longer latency it dissipates a substantial part of the energy in floating-point units. In this paper we explore the relation between the radix and the energy dissipated. Previous work has been done an radix-4 and radix-8 division. Here we extend this study to a radix-4 scheme with two overlapped radix-4 stages and compare the latency, area, and energy of the three implementations. Results show that by applying the low-power techniques the energy dissipation is reduced from 30% to 40%, with respect to the standard implementation. An additional 20% reduction can be obtained using a dual voltage. Moreover the energy dissipated to complete the division is roughly the same for the three radices. However, the power dissipation, proportional to the average current, increases with the radix. If reducing the energy is the priority, for the same latency radix-16 with dual voltage produces the smallest energy dissipation.","PeriodicalId":434169,"journal":{"name":"Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Low-power division: comparison among implementations of radix 4, 8 and 16\",\"authors\":\"A. Nannarelli, T. Lang\",\"doi\":\"10.1109/ARITH.1999.762829\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Although division is less frequent than addition and multiplication, because of its longer latency it dissipates a substantial part of the energy in floating-point units. In this paper we explore the relation between the radix and the energy dissipated. Previous work has been done an radix-4 and radix-8 division. Here we extend this study to a radix-4 scheme with two overlapped radix-4 stages and compare the latency, area, and energy of the three implementations. Results show that by applying the low-power techniques the energy dissipation is reduced from 30% to 40%, with respect to the standard implementation. An additional 20% reduction can be obtained using a dual voltage. Moreover the energy dissipated to complete the division is roughly the same for the three radices. However, the power dissipation, proportional to the average current, increases with the radix. If reducing the energy is the priority, for the same latency radix-16 with dual voltage produces the smallest energy dissipation.\",\"PeriodicalId\":434169,\"journal\":{\"name\":\"Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1999-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ARITH.1999.762829\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARITH.1999.762829","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Low-power division: comparison among implementations of radix 4, 8 and 16
Although division is less frequent than addition and multiplication, because of its longer latency it dissipates a substantial part of the energy in floating-point units. In this paper we explore the relation between the radix and the energy dissipated. Previous work has been done an radix-4 and radix-8 division. Here we extend this study to a radix-4 scheme with two overlapped radix-4 stages and compare the latency, area, and energy of the three implementations. Results show that by applying the low-power techniques the energy dissipation is reduced from 30% to 40%, with respect to the standard implementation. An additional 20% reduction can be obtained using a dual voltage. Moreover the energy dissipated to complete the division is roughly the same for the three radices. However, the power dissipation, proportional to the average current, increases with the radix. If reducing the energy is the priority, for the same latency radix-16 with dual voltage produces the smallest energy dissipation.