{"title":"Binary stochastic implementation of digital logic","authors":"Yanzi Zhu, Peiran Suo, K. Bazargan","doi":"10.1145/2554688.2554778","DOIUrl":"https://doi.org/10.1145/2554688.2554778","url":null,"abstract":"Stochastic computing refers to a mode of computation in which numbers are treated as probabilities implemented as 0/1 bit streams, which essentially is a unary encoding scheme. Previous work has shown significant reduction in area and increase in fault tolerance for low to medium resolution values (6-10 bits). However, this comes at very high latency cost. We propose a novel hybrid approach combining traditional binary with unary stochastic encoding, called binary stochastic. Similar to the binary representation, it is a positional number system, but instead of only 0/1 digits, the digits would be fractions. We show how simple logic such as adders and multipliers can be implemented, and then show more complex function implementations such as the gamma correction function and functions such as tanh, absolute and exponentiation using both combinational and sequential binary stochastic logic. Our experiments show significant reduction in latency compared to unary stochastic, while using significantly smaller area compared to binary implementations on FPGAs.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"44 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130490016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new basic logic structure for data-path computation (abstract only)","authors":"P. Gaillardon, L. Amarù, G. Micheli","doi":"10.1145/2554688.2554701","DOIUrl":"https://doi.org/10.1145/2554688.2554701","url":null,"abstract":"Nowadays, Field Programmable Gate Arrays (FPGA) implement arithmetic functions using specific circuits at the logic block level, such as the carry paths, or at the structure level adopting Digital Signal Processing (DSP) blocks. Nevertheless, all these approaches, introduced to ease the realization of specific functions, lack generality. In this paper, we introduce a new logic block that natively realizes arithmetic functions while preserving the versatility to implement general logic functions. It consists of a partially interconnected matrix of signal routers driven by comparators. We demonstrate that this structure can realize (i) any 2-output 2-input logic function or (ii) any single-output 3-input logic function or (iii) specific logic, such as arithmetic functions, with up to 4 outputs and 8 inputs. As compared to a standard 6-input Look Up Table (LUT), the proposed block requires roughly the same area but is 35.3% faster. Even though the proposed block does not have the same exhaustive configurability as a 6-input LUT, there are arithmetic functions realizable in a single block that do not fit in one, or even several, 6-input LUTs. For example, a single block inherently implements an entire 3-bit adder that requires 3× more resources with LUTs plus custom circuitry. From a system level perspective, we show that a 256-bit adder is implemented with a gain on area×delay product of 31% as compared to its traditional LUT-based counterpart.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127897732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An automatic netlist and floorplanning approach to improve the MTTR of scrubbing techniques (abstract only)","authors":"Bernhard Schmidt, Daniel Ziener, J. Teich","doi":"10.1145/2554688.2554730","DOIUrl":"https://doi.org/10.1145/2554688.2554730","url":null,"abstract":"We introduce a new SEU mitigation approach which minimizes the scrubbing effort by a) using an automatic classification of the criticality of netlist instances and their resulting configuration bits, and by b) minimizing the number of frames which must be scrubbed by using intelligent floorplanning. The criticality of configuration bits is defined by the actions needed to correct a radiation-induced SEU at this bit. Indeed, circuits that involve feedback loops might continue to cause a malfunction indefinitely even if scrubbing is applied to the involved configuration frames. Here, only supplementary state-restoring might be a viable solution. By analyzing an FPGA design already at the logic level and partitioning the configuration bits of the resulting FPGA mapping into so-called essential bits and critical bits, we are able to significantly reduce the number of time-consuming state-restoring actions. Moreover, by using placement and routing constraints, it is shown how to minimize the number of frames which have to be reconfigured or checked when using scrubbing. By applying both methods, we show a reduction of the Mean-Time-To-Repair (MTTR) for sequential benchmark circuits by up to 48.5% compared to a state-of-the-art approach.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123259606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-efficient multiplier-less discrete convolver through probabilistic domain transformation","authors":"Mohammed Alawad, Yu Bai, R. Demara, Mingjie Lin","doi":"10.1145/2554688.2554769","DOIUrl":"https://doi.org/10.1145/2554688.2554769","url":null,"abstract":"Energy efficiency and algorithmic robustness typically are conflicting circuit characteristics, yet with CMOS technology scaling towards 10-nm feature size, both become critical design metrics simultaneously for modern logic circuits. This paper proposes a novel computing scheme hinged on probabilistic domain transformation aiming for both low power operation and fault resilience. In such a computing paradigm, algorithm inputs are first encoded through probabilistic means, which translates the input values into a number of random samples. Subsequently, light-weight operations, such as simple additions, will be performed on these random samples in order to generate new random variables. Finally, the resulting random samples will be decoded probabilistically to give the final results.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114078938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Redefining the role of FPGAs in the next generation avionic systems (abstract only)","authors":"V. Viswanathan, R. B. Atitallah, J. Dekeyser, Benjamin Nakache, M. Nakache","doi":"10.1145/2554688.2554744","DOIUrl":"https://doi.org/10.1145/2554688.2554744","url":null,"abstract":"Embedded reconfigurable computing is becoming a new paradigm for system designers in avionic applications. In fact, FPGAs can be used for more than just computational purposes in order to improve the system performance. The introduction of the FPGA Mezzanine Card (FMC) I/O standard has given a new purpose for FPGAs to be used as a communication platform. Taking into account the features offered by FPGAs and FMCs, such as runtime reconfiguration and modularity, we have redefined the role of these devices to be used as a generic communication and computation-centric platform. A new modular, runtime reconfigurable, Intellectual Property (IP)-based communication-centric platform for avionic applications has been designed. This means that, when the communication requirement of an avionic system changes, the necessary communication protocol is installed and executed on demand, without disturbing the normal operation of a time-critical avionic system. The efficiency and the performance of our platform are illustrated through a real industrial use-case designed using a computationally intensive application and several avionic I/O bus standards. The reconfiguration latency can be hidden entirely in many cases, while in others the overhead of reconfiguration can be justified by the reduction in resource utilization.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116212236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power estimation tool for system on programmable chip based platforms (abstract only)","authors":"S. Rethinagiri, Oscar Palomar, A. Cristal, O. Unsal","doi":"10.1145/2554688.2554718","DOIUrl":"https://doi.org/10.1145/2554688.2554718","url":null,"abstract":"The ever increasing complexity of applications results in the development of power hungry processors. There is a scarcity of standalone tools that have a good trade-off between estimation speed and accuracy to estimate power/energy at an earlier phase of the design flow. There are very few tools that address the design space exploration issue based on power and energy. In this paper, we propose a virtual platform based standalone power and energy estimation tool for System-on-Programmable Chip (SoPC) embedded platforms, which is independent of in-house tools. There are two steps involved in this tool development. The first step is power model generation. For the power model development, we used functional parameters to set up generic power models for the different parts of the system. This is a one-time activity. In the second step, a simulation based virtual platform framework is developed to evaluate accurately the activities used in the related power models developed in the first step. The combination of the two steps leads to a hybrid power estimation, which gives a better trade-off between accuracy and speed. The proposed tool has several benefits: it considers the power consumption of the embedded system in its entirety and leads to accurate estimates without costly and complex equipment. The proposed tool is also scalable for exploring complex embedded multi-core architectures. The effectiveness of our proposed tool is validated through a dual-core RISC processor designed around the FPGA board and extended to accommodate futuristic multi-core processors for a reliable energy based design space exploration. The accuracy of our proposed tool is evaluated by using a variety of industrial benchmarks such as Multimedia, EEMBC and SPEC2006. Estimated power values are compared to real board measurements and also to McPAT. Our power/energy estimation results show less than 9% error for a heterogeneous MPSoC based system and are 200% faster compared to other state-of-the-art power estimation tools.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"09 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116536653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimally mitigating BTI-induced FPGA device aging with discriminative voltage scaling (abstract only)","authors":"Yu Bai, Mohammed Alawad, Mingjie Lin","doi":"10.1145/2554688.2554752","DOIUrl":"https://doi.org/10.1145/2554688.2554752","url":null,"abstract":"With the CMOS technology aggressively scaling towards the 22nm node, modern FPGA devices face tremendous aging-induced reliability challenges due to Bias Temperature Instability (BTI) and Hot Carrier Injection (HCI). This paper presents a novel anti-aging technique at logic level that is both scalable and applicable for VLSI digital circuits implemented with FPGA devices. The key idea is to prolong the lifetime of FPGA-mapped designs by strategically elevating the VDD values of some LUTs based on their modular criticality values. Although the idea of scaling VDD in order to improve either energy efficiency or circuit reliability has been explored extensively, our study distinguishes itself by approaching this challenge through an analytical procedure, and is therefore able to maximize the overall reliability of the target FPGA design by rigorously modelling the BTI-induced device reliability and optimally solving the VDD assignment problem.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134192810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Future inter-FPGA communication architecture for multi-FPGA based prototyping (abstract only)","authors":"Qingshan Tang, M. Tuna, H. Mehrez","doi":"10.1145/2554688.2554747","DOIUrl":"https://doi.org/10.1145/2554688.2554747","url":null,"abstract":"Multi-FPGA boards are widely used for rapid system prototyping. Even though the prototyping is trying to reach the maximum performance, the performance is limited by the inter-FPGA communication. As the capacity per I/O for each FPGA generation is increasing, FPGA I/Os are becoming a scarce resource. The design is divided into several parts, each of which fits in a single FPGA. Signals crossing between design parts located in different FPGAs are called cut nets. In order to resolve the pin limitation problem, cut nets are sent between FPGAs in a pipelined way using the Time-Division-Multiplexing technique. The maximum number of cut nets passing through one FPGA I/O is called the TDM ratio. There are two multiplexing architectures used for multi-FPGA based prototyping: Logic Multiplexing and ISERDES/OSERDES. In this paper, a new multiplexing architecture, Multi-Gigabit Transceiver (MGT), is proposed. Experiments are done in a multi-FPGA board with the testbench LFSR to validate the achieved performance. Assuming that all the FPGA I/Os used for inter-FPGA communication will be MGT capable in the future, analyses show that the proposed multiplexing architecture can achieve higher performance when the TDM ratio exceeds 67. The gain in performance of the proposed architecture over the existing architecture grows as the TDM ratio increases.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133748362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-processing with dynamic reconfiguration on heterogeneous MPSoC: practices and design tradeoffs (abstract only)","authors":"Chao Wang, Xi Li, Xuehai Zhou, Yunji Chen, K. Bertels","doi":"10.1145/2554688.2554695","DOIUrl":"https://doi.org/10.1145/2554688.2554695","url":null,"abstract":"Reconfiguration technique has been considered as one of the most promising electronic design automation (EDA) technologies in MPSoC design paradigms. However, due to the unavoidable latency in the reconfiguration procedure, it still poses a significant challenge to efficiently analyze the trade-offs for the software/hardware execution, static reconfiguration and dynamic reconfiguration. In this paper we first present a heterogeneous MPSoC middleware to support state-of-the-art dynamic partial reconfigurable technologies. Furthermore, we evaluate the reconfiguration latency and analyze the trade-off for the dynamic partial reconfiguration technologies. As a practical study, a heterogeneous MPSoC prototype with JPEG application has been developed on Xilinx Zynq FPGA with state-of-the-art static/dynamic partial reconfigurable technologies. Experimental results on the JPEG case studies demonstrated the leverage among the software execution, hardware execution, and static/dynamic reconfiguration. For the quantitative approach, we have demonstrated the execution time for the different configurations of the hardware steps in JPEG, and the quantitative impact of the dynamic reconfiguration execution. The dynamic reconfiguration could gain the performance benefits for large scale (larger than a certain threshold) computational tasks. Furthermore, overheads and HWICAP hardware utilization have been measured and discussed. This work was supported by the NSFC grants No. 61379040, No. 61272131 and No. 61202053.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116161565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Tools and methods","authors":"J. Anderson","doi":"10.1145/3260938","DOIUrl":"https://doi.org/10.1145/3260938","url":null,"abstract":"","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123799353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}