{"title":"An efficient VLSI architecture for 2-D convolution with quadrant symmetric kernels","authors":"Ming Z. Zhang, H. T. Ngo, A. Livingston, V. Asari","doi":"10.1109/ISVLSI.2005.15","DOIUrl":null,"url":null,"abstract":"A high performance digital architecture for computing 2D convolution utilizing the quadrant symmetry of the kernels is proposed in this paper. Pixels in the four quadrants of the kernel region with respect to an image pixel are considered simultaneously for computing the partial products of the convolution sum. A novel data handling strategy to identify the pixels to be fed to different processing elements helps reducing the data storage requirements in the circuitry. The new design results in 75% reduction in multipliers and 50% reduction in adders when compared with the conventional systolic architecture. The proposed architecture design is capable of performing convolution operations with 14/spl times/14 kernel at a rate of 57 1024/spl times/1024 frames per second in a Xilinx 's Virtex 2v2000ff896-4 FPGA.","PeriodicalId":158790,"journal":{"name":"IEEE Computer Society Annual Symposium on VLSI: New Frontiers in VLSI Design (ISVLSI'05)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Computer Society Annual Symposium on VLSI: New Frontiers in VLSI Design (ISVLSI'05)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISVLSI.2005.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
A high performance digital architecture for computing 2D convolution utilizing the quadrant symmetry of the kernels is proposed in this paper. Pixels in the four quadrants of the kernel region with respect to an image pixel are considered simultaneously for computing the partial products of the convolution sum. A novel data handling strategy to identify the pixels to be fed to different processing elements helps reducing the data storage requirements in the circuitry. The new design results in 75% reduction in multipliers and 50% reduction in adders when compared with the conventional systolic architecture. The proposed architecture design is capable of performing convolution operations with 14/spl times/14 kernel at a rate of 57 1024/spl times/1024 frames per second in a Xilinx 's Virtex 2v2000ff896-4 FPGA.