{"title":"A variable long-precision arithmetic unit design for reconfigurable coprocessor architectures","authors":"A. Tenca, M. Ercegovac","doi":"10.1109/FPGA.1998.707899","DOIUrl":"https://doi.org/10.1109/FPGA.1998.707899","url":null,"abstract":"This paper presents the organization of an arithmetic unit for variable long-precision (VLP) operands suitable for reconfigurable computing. The reconfigurable arithmetic coprocessor (RAC) cooperates with the host computer in the VLP tasks. The main design issues addressed in the paper are: (a) mapping of the most frequent and time consuming operations of the VLP arithmetic algorithms to RAC, and (b) design of VLP algorithms that allow reduced reconfiguration time between arithmetic operations. The VLP arithmetic algorithms proposed cover multiplication, division and square root. In this paper we present the main building blocks used in the VLP arithmetic circuits, show the similarities of each arithmetic operator and present area/time estimates of these circuits in Xilinx FPGAs.","PeriodicalId":309841,"journal":{"name":"Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114851341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of the XC6000 architecture for embedded system design","authors":"K. Weiß, Ronny Kistner, A. Kunzmann, W. Rosenstiel","doi":"10.1109/FPGA.1998.707902","DOIUrl":"https://doi.org/10.1109/FPGA.1998.707902","url":null,"abstract":"Novel FPGA architectures combined with new implementation methods influence the widespread area of embedded system design. Current research results based on the Global Run Time Reconfiguration (RTR) method show improvements to the functional density of up to 500% in contrast to the widely used Compile Time Reconfiguration (CTR) method. In addition, the RTR method applied to a partial reconfigurable architecture, like Xilinx XC6000, promises further enhancements. This paper analyses different implementation methods in order to exploit the resources of specific FPGA architectures. The development of an ATM diagnostic monitor serves as a realistic application to analyse these methods. We will analyse the differences between the XC4000E/EX and XC6000 FPGA architecture, based on their use of the same CTR implementation method. Additionally: applying Local RTR to XC6000 leads to a further benefit of about 20% in contrast to the CTR method. The evaluation process is based on the FZI internal rapid prototyping environment, which is best suited for sophisticated ATM applications.","PeriodicalId":309841,"journal":{"name":"Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251)","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124751701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A reconfigurable multiplier array for video image processing tasks, suitable for embedding in an FPGA structure","authors":"S. D. Haynes, P. Cheung","doi":"10.1109/FPGA.1998.707900","DOIUrl":"https://doi.org/10.1109/FPGA.1998.707900","url":null,"abstract":"This paper presents a design for a reconfigurable multiplier array. The multiplier is constructed using an array of 4 bit Flexible Array Blocks (FABs), which could be embedded within a conventional FPGA structure. The array can be configured to perform a number of 4n×4m bit signed/unsigned binary multiplications. We have estimated that the FABs are about 25 times more efficient in area than the equivalent multiplier implemented using a conventional FPGA structure alone.","PeriodicalId":309841,"journal":{"name":"Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132122939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation of RNS addition and RNS multiplication into FPGAs","authors":"L. Maltar, F. França, V. Alves, C. L. Amorim","doi":"10.1109/FPGA.1998.707940","DOIUrl":"https://doi.org/10.1109/FPGA.1998.707940","url":null,"abstract":"We investigate whether arithmetic operations based on Residue Number Systems (RNS) are cost-effective solutions to implement DSP applications into reconfigurable hardware. We simulated several RNS addition and multiplication implementations by varying the RNS parameters. For RNS addition, our results show that it can be implemented into a 3-stage 80.6-92.5 MHz pipeline using about 22 to 33 FPGAs' logic cells. For RNS multiplication, the attainable speed range was between 78.1 and 87.7 MHz, for operand lengths varying between 5 and 8 bits. Overall, a hybrid solution that combines logical elements and blocks of RAM is the best option, producing better average performance across the whole range of operand lengths.","PeriodicalId":309841,"journal":{"name":"Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126678602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An embedded DRAM-FPGA chip with instantaneous logic reconfiguration","authors":"M. Motomura, Y. Aimoto, A. Shibayama, Y. Yabe, M. Yamashina","doi":"10.1109/FPGA.1998.707909","DOIUrl":"https://doi.org/10.1109/FPGA.1998.707909","url":null,"abstract":"Reconfigurable computing is attracting wide attention as a novel general purpose computing paradigm for accelerating compute intensive and/or data-parallel applications, such as compression, encryption, searching, sorting, and image processing. A key enabling technology for a reconfigurable computer is in-system logic reconfiguration of SRAM-based FPGAs, through which its hardware architecture is dynamically customized for a specific task on demand. Quicker a reconfiguration is, more frequent the reconfigurations can become: i.e., a reconfigurable computer can adapt to applications which have more dynamic behavior. A whole-chip reconfiguration in conventional FPGAs, however, takes at least 100 μs. With this long latency, a reconfigurable computer is adaptable only to static applications, substantially losing the general-purposeness of the original concept. Integrating a DRAM with an FPGA can become an ideal solution to this problem. The on-chip DRAM can store hundreds of configuration programs, and the logic reconfiguration can get extremely faster by context-switching among the programs utilizing huge bandwidth internal to the DRAM core. Being driven by this observation, we have conducted prototype design of an embedded DRAM-FPGA chip.","PeriodicalId":309841,"journal":{"name":"Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116909571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}