{"title":"Channel segmentation design for symmetrical FPGAs","authors":"Wai-Kei Mak, D. F. Wong","doi":"10.1109/ICCD.1997.628914","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628914","url":null,"abstract":"The channel segmentation design problem for symmetrical FPGAs is the problem of designing segmented tracks in the interconnection channels that provides good net routability and delay performance at the same time. In this paper, we show how to separate the problem into the segmentation design problems of the vertical and horizontal channels by a statistical analysis of the net distribution on a symmetrical FPGA. And we propose an effective approach for segmented channel design when the allowed number of tracks in a channel is fixed and limited.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127641432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On complexity reduction of FIR digital filters using constrained least squares solution","authors":"K. Muhammad, K. Roy","doi":"10.1109/ICCD.1997.628868","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628868","url":null,"abstract":"We apply constrained least squares solution (CLS) to the problem of reducing the number of operations in FIR digital filters with a motivation of reducing its power consumption. The constraints are defined by the maximum allowable add/subtract operations in forming the products which are used in computing the output. We show that truncation and rounding of coefficients can be viewed as power constrained least squares (PCLS) solutions. Further, we show that in dedicated DSP processor based architectures it is possible to reduce power by using PCLS coefficients along with appropriately modified multipliers. It is also shown that the Booth multiplier effectively reduces the complexity of such filters, thereby increasing power savings. Finally, we show that typically 80% to 45% reduction in number of operations can be obtained for systems employing uncoded and Booth recoded multipliers, respectively.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132281195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical advances in asynchronous design","authors":"E. Brunvand, S. Nowick, K. Yun","doi":"10.1109/ICCD.1997.628936","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628936","url":null,"abstract":"Recent practical advances in asynchronous circuit and system design have resulted in renewed interest by circuit designers. Asynchronous systems are being viewed as an increasingly viable alternative to globally synchronous system organization. This tutorial will present the current state of the art in asynchronous circuit and system design in three different areas. The first section details asynchronous control systems. The second describes a variety of approaches to asynchronous datapaths. The third section is on asynchronous and self-timed circuits applied to the design of general purpose processors.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131279250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intertwined development and formal verification of a 60x bus model","authors":"Matt Kaufmann, C. Pixley","doi":"10.1109/ICCD.1997.628845","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628845","url":null,"abstract":"We describe a project in which the IBM/Motorola 60x bus protocol was incrementally modeled at an abstract level in Verilog and verified using Motorola's Verdict model checker. The primary purpose of the modeling activity was to acquaint verification personnel with details of the 60x bus protocol and to document specific properties of the 60x bus that are necessary to guarantee compliance with hand-written protocol documentation. Our Verilog 60x bus model documents the 60x bus protocol for other Motorola business units.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132101864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced compression techniques to simplify program decompression and execution","authors":"M. Breternitz, Roger Smith","doi":"10.1109/ICCD.1997.628865","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628865","url":null,"abstract":"Compressing instruction sequences can reduce the cost of embedded systems by reducing program ROM-size requirements. Compression also facilitates the use of RISC core architectures, like the PowerPC™ architecture, in embedded systems. Compression techniques are presented which enable decompression and execution of compressed code to occur without the need of a lookaside table (LAT) or cache lookaside buffer (CLB). These techniques successfully merge code modification and compression into a single software preprocessing step. Decompression and execution of compressed code are made very simple. An application of these techniques to about 120000 instructions of PowerPC firmware code is described.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133411437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal clock period clustering for sequential circuits with retiming","authors":"P. Pan, Arvind K. Karandikar, C. Liu","doi":"10.1109/ICCD.1997.628858","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628858","url":null,"abstract":"We consider the problem of clustering sequential circuits subject to a bound on the area of each cluster, with the objective of minimizing clock period. Current algorithms address combinational circuits only, and treat a sequential circuit as a special case, by removing the flip-flops (FFs) and clustering the remaining combinational logic. This approach segments a circuit and assumes the positions of the FFs are fixed. The positions of FFs are in fact dynamic, because of retiming. As a result, current algorithms can only consider a small portion of the available solution space. In this paper, we present a clustering algorithm that does not remove the FFs. It also considers the effect of retiming. The algorithm can produce clustering solutions with optimal clock periods under the unit delay model. For the general delay model, it can produce clustering solutions with clock periods provably close to minimum.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131851574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthesis of delay verifiable sequential circuits using partial enhanced scan","authors":"R. Tekumalla, P. R. Menon","doi":"10.1109/ICCD.1997.628934","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628934","url":null,"abstract":"The path delay fault testability of sequential circuits is limited by state transitions that can be produced during normal operation. As a result, there may be untestable faults some of which may affect circuit behavior. The authors first extend the concept of primitive faults to sequential circuits. They then describe a method of selecting a set of flip-flops for partial enhanced-scan, such that falling transitions on all paths are made robustly testable in a two-level prime and irredundant realization of the sequential circuit. It results in a robust or VNR test for every rising transition primitive fault, using the available state transitions. A method of synthesizing sequential circuits such that untestable faults do not affect the initialization, is presented. An area comparison between area-optimized and delay-verifiable versions of the MCNC '91 benchmark circuits is also presented.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115232401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vector restoration based static compaction of test sequences for synchronous sequential circuits","authors":"I. Pomeranz, S. Reddy","doi":"10.1109/ICCD.1997.628895","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628895","url":null,"abstract":"The authors propose a new procedure for static compaction that belongs to the class of procedures that omit test vectors from a given test sequence in order to reduce its size without reducing the fault coverage. The previous procedures that achieved high levels of compaction using this technique attempted to omit test vectors from a given test sequence one at a time or in consecutive subsequences. Consequently, the omission of each vector or subsequence required extensive simulation to determine the effects of each vector omission on the fault coverage. The proposed procedure first omits (almost) all the test vectors from the sequence, and then restores some of them as necessary to achieve the required fault coverage. The decision to restore a vector requires simulation of a single fault. Thus, the overall computational effort of this procedure is significantly lower. The loss of compaction compared to the scheme that omits the vectors one at a time or in subsequences is small in most cases. Experimental results are presented to support these claims.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115464702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Built-in temperature sensors for on-line thermal monitoring of microelectronic structures","authors":"Karim Arabi, B. Kaminska","doi":"10.1109/ICCD.1997.628909","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628909","url":null,"abstract":"Built-in temperature sensors increase the system reliability by predicting eventual faults caused by excessive chip temperatures. In this paper, simple and efficient built-in temperature sensors for the on-line thermal monitoring of microelectronic structures are introduced. The proposed temperature sensors produce a signal oscillating at a frequency proportional to the temperature of the microelectronic structure and therefore they are compatible to the oscillation-test method. Design and detailed characteristics of the sensors proposed based on CMOS 1.2 μm technology parameters are presented. The fabrication results show a small spread in the nominal oscillation frequency of sensors implemented and a good sensitivity of the oscillation frequency with respect to temperature variations. The sensors proposed require very small power dissipation and silicon area.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125192179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and implementation of low-power digit-serial multipliers","authors":"Yun-Nan Chang, J. Satyanarayana, K. Parhi","doi":"10.1109/ICCD.1997.628867","DOIUrl":"https://doi.org/10.1109/ICCD.1997.628867","url":null,"abstract":"Digit-serial architectures obtained using traditional unfolding techniques cannot be pipelined beyond a certain level because of the presence of feedback loops. In this paper, a novel design methodology is presented which permits bit-level pipelining of the digit-serial architectures. This achieves sample speeds close to corresponding bit-parallel multipliers with significantly lower area. This increased sample speed can be traded with reduction in power supply voltage resulting in significant reduction in power consumption. The results show that for transformed multipliers with smaller digit-sizes (≤4), the singly-redundant multiplier consumes the least power and for larger digit sizes, the type-I multiplier consumes the least power. It is also found that the optimum digit-size for least power consumption in type-I and type-III multipliers is ~√(2W), where W represents the word-length. The proposed digit-serial multipliers consume on an average 20% lower power than the traditional digit-serial architectures for the non-pipelined case, and about 5-15 times lower power for the bit-level pipelined case. Also, modified Booth (1951) recoding is applied to transformed multipliers and it is found that the recoded multipliers consume about 22% lower power than the transformed multipliers without recoding.","PeriodicalId":154864,"journal":{"name":"Proceedings International Conference on Computer Design VLSI in Computers and Processors","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121946271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}