Power comparison of flow-graph and distributed arithmetic based DCT architectures

Conference Record of Thirty-Second Asilomar Conference on Signals, Systems and Computers (Cat. No.98CH36284) Pub Date : 1998-12-01 DOI:10.1109/ACSSC.1998.751519

M. Kuhlmann, K. Parhi


Citations: 16

Abstract

The discrete cosine transform (DCT) is widely used in image and video compression systems. Two popular approaches to implementing DCT algorithms are distributed arithmetic and flow graphs based on fast algorithms. Distributed arithmetic architectures (DAA) have been widely used in many system implementations because of their low latency and small area requirements. However, no systematic study of the power, area, and latency tradeoffs between the DAA and the flow-graph architecture (FGA) has been reported. This paper presents a systematic study of the area, latency, and power consumption of these two alternative architectures. It is concluded that the flow-graph architecture consumes about 39% less power than the distributed arithmetic architecture, at the expense of 28% more area and a 3.75 times increase in latency. Alternatively, by reducing the level of pipelining in the flow-graph architecture, the implementation consumes 13% less power, at the expense of 20% more area and a two times increase in latency. These results were obtained by estimating power consumption on actual layouts, with the effects of parasitic capacitance included, rather than on schematics.
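For readers unfamiliar with distributed arithmetic, the following is a minimal software sketch of the general technique: an inner product with fixed coefficients is computed bit-serially from a precomputed lookup table, so no multipliers are needed. It is not the architecture evaluated in the paper. The function names, the integer coefficients, and the 8-bit word length are assumptions chosen only for illustration.

```python
def build_lut(coeffs):
    """Precompute, for every K-bit address, the sum of the selected coefficients.

    Bit i of the address selects coeffs[i]. (Hypothetical helper name.)
    """
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(lut, xs, bits):
    """Bit-serial inner product sum(coeffs[i] * xs[i]) using only the LUT.

    Each x in xs must fit in `bits`-bit two's complement. For bit position j,
    the j-th bits of all inputs form a LUT address; the LUT output is weighted
    by 2**j, and the sign-bit term (j == bits - 1) is subtracted.
    """
    mask = (1 << bits) - 1
    acc = 0
    for j in range(bits):
        addr = 0
        for i, x in enumerate(xs):
            # (x & mask) yields the two's-complement bit pattern of x.
            addr |= (((x & mask) >> j) & 1) << i
        term = lut[addr] << j
        acc += -term if j == bits - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [3, -5, 7, 2]      # illustrative integer coefficients
    xs = [10, -3, 0, -8]        # illustrative 8-bit signed inputs
    lut = build_lut(coeffs)
    direct = sum(c * x for c, x in zip(coeffs, xs))
    print(da_inner_product(lut, xs, bits=8) == direct)
```

In hardware, the table becomes a ROM and the loop over `j` becomes a shift-accumulate step per clock cycle. A flow-graph implementation instead maps the adders and multipliers of a fast DCT algorithm directly onto the datapath.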
