{"title":"MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources","authors":"E. Mirsky, A. DeHon","doi":"10.1109/FPGA.1996.564808","DOIUrl":"https://doi.org/10.1109/FPGA.1996.564808","url":null,"abstract":"MATRIX is a novel, coarse-grain, reconfigurable computing architecture which supports configurable instruction distribution. Device resources are allocated to controlling and describing the computation on a per task basis. Application-specific regularity allows us to compress the resources allocated to instruction control and distribution, in many situations yielding more resources for datapaths and computations. The adaptability is made possible by a multi-level configuration scheme, a unified configurable network supporting both datapaths and instruction distribution, and a coarse-grained building block which can serve as an instruction store, a memory element, or a computational element. In a 0.5 μm CMOS process, the 8-bit functional unit at the heart of the MATRIX architecture has a footprint of roughly 1.5 mm × 1.2 mm, making single dies with over a hundred function units practical today. At this process point, 100 MHz operation is easily achievable, allowing MATRIX components to deliver on the order of 10 Gop/s (8-bit ops).","PeriodicalId":244873,"journal":{"name":"1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines","volume":"2016 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132757381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling and partitioning ANSI-C programs onto multi-FPGA CCM architectures","authors":"J. Peterson, R. O'Connor, P. Athanas","doi":"10.1109/FPGA.1996.564821","DOIUrl":"https://doi.org/10.1109/FPGA.1996.564821","url":null,"abstract":"The increasing size and speed of modern FPGAs allow complex computations, on the order of an average sized program, to be performed in a small collection of processing elements. It is well documented that the execution of large sections of a program within the \"virtual hardware\" offered by an attached FPGA processor can provide substantial speedup over the ordinary execution within a sequential, general-purpose processor. Unfortunately, the development tools currently available for FPGAs do not allow for easily configuring multi-FPGA custom computing machines. Configuration of an FPGA architecture requires scheduling: the mapping of computations onto existing functional units. To take advantage of all available logic, computations may span processing elements, calling for partitioning of a subroutine between one or more FPGAs. In this paper, an architecture-independent design tool is presented for translating programs written in C to a dataflow representation and then efficiently scheduling and partitioning the resulting graphs onto multi-FPGA computing platforms.","PeriodicalId":244873,"journal":{"name":"1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133318279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the viability of FPGA-based integrated coprocessors","authors":"Osama T. Albaharna, P. Cheung, T. Clarke","doi":"10.1109/FPGA.1996.564843","DOIUrl":"https://doi.org/10.1109/FPGA.1996.564843","url":null,"abstract":"The paper examines the viability of using integrated programmable logic as a coprocessor to support a host CPU core. This adaptive coprocessor is compared to a VLIW machine in terms of both die area occupied and performance. The parametric bounds necessary to justify the adoption of an FPGA-based coprocessor are established. An abstract field programmable gate array model is used to investigate the area and delay characteristics of arithmetic circuits implemented on FPGA architectures to determine the potential speedup of FPGA-based coprocessors. Analysis shows that integrated FPGA arrays are suitable as coprocessor platforms for realising algorithms that require only limited numbers of multiplication instructions. Inherent FPGA characteristics limit the data-path widths that can be supported efficiently for these applications. An FPGA-based adaptive coprocessor requires a large minimum die area before any advantage over a VLIW machine of a comparable size can be realised.","PeriodicalId":244873,"journal":{"name":"1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126422116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modelling and optimising run-time reconfigurable systems","authors":"W. Luk, N. Shirazi, P. Cheung","doi":"10.1109/FPGA.1996.564815","DOIUrl":"https://doi.org/10.1109/FPGA.1996.564815","url":null,"abstract":"We present a simple model for specifying and optimising designs which contain elements that can be reconfigured at run-time. In this model the control mechanism for reconfiguration can be implemented in many ways: by the user using multiplexers or other logic blocks, or by FPGAs which support dynamic partial reconfiguration. The model can be used for assessing trade-offs in run-time reconfigurable systems such as operation speed, design size, reconfiguration time and complexity of reconfiguration controllers; current work includes expressing the model in a framework which also captures layout information. Our approach is illustrated by various reconfigurable implementations for filtering and locating edges in images. The design tradeoffs of these implementations are being evaluated on a PCI platform, which contains a Xilinx 6216 device.","PeriodicalId":244873,"journal":{"name":"1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114981263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}