MICRO 24 | Pub Date: 1991-09-01 | DOI: 10.1145/123465.123502
G. Essink, E. Aarts, R. V. Dongen, P. V. Gerwen, J. Korst, K. Vissers
{"title":"Architecture and programming of a VLIW style programmable video signal processor","authors":"G. Essink, E. Aarts, R. V. Dongen, P. V. Gerwen, J. Korst, K. Vissers","doi":"10.1145/123465.123502","DOIUrl":"https://doi.org/10.1145/123465.123502","url":null,"abstract":"The architecture and programming aspects of a programmable video signal processor are discussed. The processor is an integrated circuit that has a modular architecture with a number of programmable, pipelined processing elements. Networks of these processors can be programmed conveniently with the aid of dedicated programming tools. In this paper the emphasis is on the scheduling of video algorithms and the micro code generation for a network of video signal processors. Due to the periodic nature of the video algorithms and the small periods that are involved, successive executions of the video algorithm have to be interleaved in time. We present a novel solution approach to the scheduling problem using phase assignment as the central part. Results of this approach are presented for industrially significant video applications.","PeriodicalId":118572,"journal":{"name":"MICRO 24","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117060782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MICRO 24 | Pub Date: 1991-09-01 | DOI: 10.1145/123465.123493
R. Walker, Shivkumar Ramabadran, R. Joshi, Steinar Flatland
{"title":"Increasing user interaction during high-level synthesis","authors":"R. Walker, Shivkumar Ramabadran, R. Joshi, Steinar Flatland","doi":"10.1145/123465.123493","DOIUrl":"https://doi.org/10.1145/123465.123493","url":null,"abstract":"Most high-level synthesis systems act as “black boxes”, with the user allowed minimal interaction during the synthesis process. To allow the user to take a more active role, using his or her creativity where appropriate, advances must be made in three key areas. Better metrics must be developed, to allow the user to quickly evaluate the quality of the design at various points in the synthesis process, and to identify developing problem areas in the design. Better interaction techniques must be devised to allow the user to observe, control, and interact with the synthesis tools. Finally, as a foundation for these metrics and synthesis tools, a well-designed graphical user interface must be developed; this interface must be able to display both small and large designs, and must support mechanisms to control the display complexity. These are three important issues to address in making high-level synthesis a practical design tool.","PeriodicalId":118572,"journal":{"name":"MICRO 24","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130201406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MICRO 24 | Pub Date: 1991-09-01 | DOI: 10.1145/123465.123498
M. Nemirovsky, F. Brewer, R. Wood
{"title":"DISC: dynamic instruction stream computer","authors":"M. Nemirovsky, F. Brewer, R. Wood","doi":"10.1145/123465.123498","DOIUrl":"https://doi.org/10.1145/123465.123498","url":null,"abstract":"The Dynamic Instruction Stream Computer is a novel computer architecture which addresses many of the problems present in real-time systems. The DISC operates by allowing multiple instruction streams (ISs), representing different processes, to run concurrently by instruction interleaving on the pipeline. Also, the throughput of the DISC can be partitioned in any way between the multiple ISs. Conventional architectures are more concerned with overall performance and throughput than with real-time response. In other words, they optimize the system for the functions that are more heavily used without regard to responsiveness to individual requests. Applications abound where a high degree of responsiveness is required, without too much sacrifice of overall efficiency. This is particularly true in real-time control applications where it is important to optimize the critical loops and respond promptly to interrupts. DISC addresses this problem by dynamically partitioning the processor throughput between multiple instruction streams based upon requirement demands. In this way different tasks and interrupt priorities can be assigned to guarantee their deadlines.","PeriodicalId":118572,"journal":{"name":"MICRO 24","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134313996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MICRO 24 | Pub Date: 1991-09-01 | DOI: 10.1145/123465.123501
Haigeng Wang, A. Nicolau, R. Potasman
{"title":"A new technique for induction variable removal","authors":"Haigeng Wang, A. Nicolau, R. Potasman","doi":"10.1145/123465.123501","DOIUrl":"https://doi.org/10.1145/123465.123501","url":null,"abstract":"Removing redundant loop induction variables (IV’s) in a sequential program can improve code performance by making effective use of registers and reducing the dynamic instruction count in the loop. At the microcode level and in high-performance, fine-grain parallel architectures, it is even more important that a parallelizing compiler be able to remove redundant IV’s generated as a by-product of parallelizing transformations. Conventional IV detection algorithms fail to find an IV family with no basic IV. Copy propagation in general cannot transform an IV family with no basic IV into a family with a basic IV. As a result, conventional IV removal methods do not work for more general types of IV families, which often result from loop parallelizing transformations and also exist in sequential programs. Furthermore, IV removal by copy propagation with loop unrolling cannot preserve the semantics of the original code, in addition to being space-inefficient. We present in this paper a new technique for redundant IV removal. It can remove redundant IV’s from more general types of IV families without the overhead of a code size increase, which is inevitably incurred by other methods such as loop unwinding and copy propagation with node splitting. It can also be used to determine whether redundant IV’s should be removed (i.e., whether removal benefits overall performance). We then demonstrate the effectiveness of this technique using some benchmarks. This work is supported in part by NSF grant CCRS704367 and ONR grant NOO014S6K0215.","PeriodicalId":118572,"journal":{"name":"MICRO 24","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123219844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MICRO 24 | Pub Date: 1991-09-01 | DOI: 10.1145/123465.123476
M. Farrens, A. Park
{"title":"Workload and implementation considerations for dynamic base register caching","authors":"M. Farrens, A. Park","doi":"10.1145/123465.123476","DOIUrl":"https://doi.org/10.1145/123465.123476","url":null,"abstract":"Dynamic Base Register Caching (DBRC) [Farrens Park Compression 1990] [Farrens Park SIGARCH18 1991] has been shown to be a useful technique for significantly reducing processor-to-memory address bandwidth. By caching the high-order portions of memory addresses in a set of dynamically allocated base registers, only small register indices need to be transmitted between the processor and memory instead of the high-order address bits themselves. In this paper we present the results of trace-driven simulations which indicate that DBRC can facilitate the provision of separate paths for instructions and data by reducing the number of address lines required for parallel address channels. In fact, tailoring DBRC for separate instruction and data streams results in superior address compression. We also show that the effectiveness of DBRC is not significantly degraded by a multiprogramming workload, for large SPEC benchmark traces. Additionally, we suggest two methods to optimize DBRC implementation. (1) A processor’s translation lookaside buffer hardware can be modified to implement DBRC in addition to its normal address translation functions. (2) DBRC latency can be hidden by properly synchronizing it with memory chip address pin multiplexing.","PeriodicalId":118572,"journal":{"name":"MICRO 24","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131674068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
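The idea summarized in the DBRC abstract above (cache the high-order address bits in a few base registers and send only a register index plus the low-order bits) can be illustrated with a minimal sketch. This is not the authors' design: the class name `BaseRegisterCache`, the LRU replacement policy, the register count, and the 12-bit low-order field are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import OrderedDict


class BaseRegisterCache:
    """Toy model of dynamic base register caching.

    Assumptions (not from the abstract): LRU replacement, a fixed number of
    base registers, and a fixed split between high- and low-order bits.
    """

    def __init__(self, num_registers=4, low_bits=12):
        self.num_registers = num_registers
        self.low_bits = low_bits
        self.registers = OrderedDict()  # high-order bits -> register index

    def encode(self, address):
        """Return (hit, register_index, low_order_bits) for one address."""
        high = address >> self.low_bits
        low = address & ((1 << self.low_bits) - 1)
        if high in self.registers:
            # Hit: only the register index and the low bits need to be sent.
            self.registers.move_to_end(high)  # mark as most recently used
            return True, self.registers[high], low
        # Miss: the high-order bits must be sent once to load a register.
        if len(self.registers) < self.num_registers:
            index = len(self.registers)
        else:
            _, index = self.registers.popitem(last=False)  # evict LRU entry
        self.registers[high] = index
        return False, index, low


def hit_rate(addresses, **kwargs):
    """Fraction of addresses whose high-order bits were already cached."""
    cache = BaseRegisterCache(**kwargs)
    hits = sum(cache.encode(a)[0] for a in addresses)
    return hits / len(addresses)
```

On a trace with good locality most references hit, so most transfers carry only a small index and the low-order bits. Calling `hit_rate` on separate instruction and data traces is one way to explore the abstract's claim that splitting the streams improves compression, under the assumptions stated above.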
MICRO 24 | Pub Date: 1991-09-01 | DOI: 10.1145/123465.123490
R. Karri, A. Orailoglu
{"title":"ALPS: an algorithm for pipeline data path synthesis","authors":"R. Karri, A. Orailoglu","doi":"10.1145/123465.123490","DOIUrl":"https://doi.org/10.1145/123465.123490","url":null,"abstract":"While techniques for the design of high-performance computing systems are well understood, software mechanisms for the automatic design of high-performance application-specific integrated circuits (ASICs) remain relatively unexplored. Advances in levels of integration will make it feasible to support performance-enhancing structures on a single chip. With the increasing demand for high performance in real-time signal processing applications, the design of high-speed ASICs merits immediate attention. In this paper, we develop software mechanisms for the high-level synthesis of high-performance VLSI systems. We have extended our interactive behavioral synthesis framework, which provides scheduling with multiple constraints including performance and cost, to support scheduling for high performance. The system is powerful enough to allow trade-offs along multiple dimensions. The software mechanisms to support high performance include a pipeline scheduler, ALPS, that supports constraints including performance and cost. ALPS is a polynomial-time algorithm. Experimental results have shown that (a) ALPS consistently synthesizes designs on the optimal-designs curve, (b) it can be used for rapid prototyping as well as for detailed synthesis, and (c) the interplay between performance and cost results in a rich set of design alternatives.","PeriodicalId":118572,"journal":{"name":"MICRO 24","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132291610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}