{"title":"A high speed Reed-Solomon decoder","authors":"M. Lee, S. Choi, J. Chang","doi":"10.1109/VLSISP.1995.527507","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527507","url":null,"abstract":"The Reed-Solomon code (RS code) has been widely employed in digital audio/video equipment such as CD, DAT, and professional video recorders owing to its excellent capability for correcting burst errors and its relatively easy implementation. We propose an architecture for an error correction circuit suitable for high-rate decoding of the Reed-Solomon code. The circuit provides encoder and decoder functions with 2-symbol random error correction as well as 4-symbol erasure correction. Its decoding rate of 23 MByte/s is sufficient for compressed video signals of high-definition as well as standard-definition TVs.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126096216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A video codec chipset for wireless multimedia networking","authors":"S. Molloy, R. Jain, K. Nishibori","doi":"10.1109/VLSISP.1995.527509","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527509","url":null,"abstract":"This paper describes a three-chip encoder/decoder for two-way video communications in a low-bit rate wireless environment. The chipset performs 2D and 3D wavelet transforms, scalar quantization, run-length/Huffman coding and bitstream formatting/parsing with fewer than 250 K total transistors. The codec provides high-quality video at bitrates between several megabits per second down to tens of kilobits per second, with a power consumption an order of magnitude lower than existing codecs. Robust decoding of corrupted bitstreams is enabled with a hierarchy of synchronization codewords and error concealment. The three chips include a programmable wavelet transform processor, a subband decoding processor and a subband encoding processor, occupying 36 mm², 42 mm² and 47 mm², respectively, in 1.2-micron CMOS.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122806312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A programmable motion estimator for a class of hierarchical algorithms","authors":"Horng-Dar Lin, A. Anesko, B. Petryna, G. Pavlovic","doi":"10.1109/VLSISP.1995.527512","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527512","url":null,"abstract":"Future generations of video codecs need programmable video processing capabilities to extend the range of applications. A key component of the video codec design is the motion estimator. Because of its high computational requirements, a programmable motion estimator design must carefully balance its programmability and cost-effectiveness. In this paper we propose a distributed programmable motion estimator architecture and optimize its processing engine for hardware efficiency and the control engine for flexibility. The distributed architecture models motion estimation as searching through a tree of vector points, where the traversing functions are implemented in a multi-mode vector search engine and the hierarchy is constructed by an algorithm controller with flexible DMA transfers. Based on the distributed architecture a programmable motion estimator is implemented within a 0.5 μm CMOS video codec for multiple video standards. The programmable motion estimator achieves near full search quality (degradation less than 0.1 dB for CIF 30 f/s H.261) with only 1/4 of processing and memory resources and can be reused for H.26X and MPEG coding across a wide range of resolution and video quality tradeoff points.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122478406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal parenthesization of lexical orderings for DSP block diagrams","authors":"S. Bhattacharyya, P. Murthy, Edward A. Lee","doi":"10.1109/VLSISP.1995.527489","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527489","url":null,"abstract":"Minimizing memory requirements for program and data are critical objectives when synthesizing software for embedded DSP applications. Previously, it has been demonstrated that for graphical programs based on the widely-used synchronous dataflow model an important class of minimum code size implementations can be viewed as parenthesizations of lexical orderings of the computational blocks. Such a parenthesization corresponds to the hierarchy of loops in the software implementation. In this paper, we present a dynamic programming technique for constructing a parenthesization that minimizes data memory cost from a given lexical ordering of a synchronous dataflow graph. For graphs that do not contain delays, this technique always constructs a parenthesization that has minimum data memory cost from among all parenthesizations for the given lexical ordering. When delays are present, the technique may make refinements to the lexical ordering while it is computing the parenthesization, and the data memory cost of the result is guaranteed to be less than or equal to the data memory cost of all valid parenthesizations for the initial lexical ordering.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128519647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constant scale factor, on-line CORDIC algorithm in the circular coordinate system","authors":"R. Hamill, J. McCanny, R. Walke","doi":"10.1109/VLSISP.1995.527527","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527527","url":null,"abstract":"Real time digital signal processing requires the development of high performance arithmetic algorithms suitable for VLSI design. In this paper, a new online, circular coordinate system CORDIC algorithm is described, which has a constant scale factor. This algorithm was developed using a new Angular Representation (AR) model. A radix 2 version of the CORDIC algorithm is presented, along with an architecture suitable for VLSI implementation.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"31 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133293699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A VLSI design for a real-time video decoder","authors":"Y. Chan, S. Kwong, K. Chan, T. Wong","doi":"10.1109/VLSISP.1995.527517","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527517","url":null,"abstract":"This paper presents the design of a VLSI implementation of a real-time video decoder. The video decoder can decode a motion CIF format video sequence from a data rate of 5 kbyte/frame at 30 frames/sec with a signal-to-noise ratio of 32 dB. It is found that the real-time decoder has a better performance when compared with previous implementations. The static decompression part of the decoder is based on the SDIC algorithm. The major advantage of the SDIC algorithm is the hardware simplicity and its VLSI realization. In this paper, the hardware design of the real-time decoder and results are presented.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132702017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fixed-point optimization utility for C and C++ based digital signal processing programs","authors":"Seehyun Kim, Ki-Il Kum, Wonyong Sung","doi":"10.1109/VLSISP.1995.527491","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527491","url":null,"abstract":"Two fixed-point optimization utility programs, the range estimator and the fixed-point simulator, are developed for scaling and wordlength determination of digital signal processing algorithms written in the C or C++ language. By exploiting the operator overloading characteristics of the C++ language, range estimation and fixed-point simulation can be conducted just by modifying the variable declarations of the original floating-point digital signal processing program. Since this utility evaluates the range and the fixed-point performance by simulation, not by analytical methods, it is easily applicable to nearly all types of digital signal processing algorithms including non-linear and time-varying systems. In addition, this utility software can be used for comparing the fixed-point characteristics of different implementation architectures.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132712025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation and performance of composite fast FIR filtering algorithms","authors":"A. Zergainoh, P. Duhamel","doi":"10.1109/VLSISP.1995.527498","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527498","url":null,"abstract":"This paper provides a technique to derive a new set of fast Finite Impulse Response (FIR) filtering algorithms founded on the combination of filtering algorithms based on short FFT and short length fast FIR filtering algorithms. Such composite algorithms have the potential to reduce the arithmetic complexity and the characteristic to maintain a low processing delay, independent of the filter length. A methodology for an efficient implementation on the Digital Signal Processor (DSP) of these algorithms is proposed by an optimised structuring and organization of data in memory in order to keep the improvement brought by the reduction of the arithmetic complexity without exceeding the DSP resources such as number of pointer registers and memory. The performance is evaluated in number of machine cycles per point computed. The solution exists to complete the generator code built for the basic algorithms by adding macro-instructions written in a \"DSP\" assembly code.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114598169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A unified cellular array for multiplication, division and square root","authors":"San-Gee Chen, Chieh-Chih Li","doi":"10.1109/VLSISP.1995.527524","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527524","url":null,"abstract":"A unified fast, small-area processor capable of executing multiplication, division and square-root operations, all starting from MSD, is proposed. Unlike the existing designs, which require both addition and subtraction operations and a complicated estimator for DIV/SQRT result digits, the proposed design consists of only addition operations and no complicated estimator. By taking negative absolute values of partial remainders, the algorithm breaks the sequential tie between residue sign detection and the next remainder update operations. As such, these two operations can be performed in parallel and independently. The proposed architecture has smaller area and more regular structure than the known designs.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127342657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software solutions for the Viterbi algorithm","authors":"M. Ikekawa, I. Kuroda","doi":"10.1109/VLSISP.1995.527480","DOIUrl":"https://doi.org/10.1109/VLSISP.1995.527480","url":null,"abstract":"Efficient software implementations of the Viterbi algorithm on two new generation processors, μPD7701x and V830 are discussed. μPD7701x is a 16-bit fixed point general purpose DSP which includes eight 40-bit general purpose registers, highly parallel operation capability, and conditional execution capability. V830 is a 32-bit RISC processor which has a multiply-accumulator and other special instructions for multimedia signal processing. These features enable effective implementations on both processors. The Viterbi decoders for rate 1/2 convolutional code are implemented on these processors and are two times faster than on conventional type DSPs.","PeriodicalId":286121,"journal":{"name":"VLSI Signal Processing, VIII","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115480726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}