{"title":"ARCHVerifyr: An Embedded Software-Driven Approach for Architecture Verification","authors":"T. Grimm, D. Lettnin, M. Hübner","doi":"10.1109/ISVLSI.2018.00049","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00049","url":null,"abstract":"The current verification flow of complex systems uses different engines synergistically: virtual prototyping, formal verification, simulation, emulation, and FPGA prototyping. However, none of them is able to verify a complete architecture. On the other hand, hybrid approaches aiming at full verification use techniques that lower the overall complexity by increasing the abstraction level. To bridge this verification gap, we turn to the embedded software and the information it can bring to the verification environment. This work focuses on the semiformal verification of complex systems at the RT level to handle the hardware peculiarities. Our results show an improvement of four times in verification completeness of a complex hardware gateway compared to the commercial tool.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"192 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114967874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting the Tolerance of Extreme Electromagnetic Interference on MOSFETs","authors":"Nishchay H. Sule, T. Powell, S. Hemmady, P. Zarkesh-Ha","doi":"10.1109/ISVLSI.2018.00114","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00114","url":null,"abstract":"Extreme Electromagnetic Interference (EEMI) can cause device malfunction due to reparable upsets before any permanent hardware damage occurs to electronic devices. In this paper, a predictive model is developed to characterize the impact of EEMI on Metal-Oxide Semiconductor Field-Effect Transistors (MOSFETs) prior to any such permanent damage. The predictive model determines the onset of tolerance limits of EEMI on the Ion/Ioff ratio of a MOSFET for a given technology node, using only the most fundamental device parameters - such as the threshold voltage and power supply. The developed model is successfully compared against measurement data from a device fabricated using 350nm standard CMOS process through TSMC. Based on the predictive model the tolerance of the EEMI injected power in a MOSFET reduces due to technology scaling, starting from 9.7dBm at 350nm, and down to -1.7dBm at 65nm technology node.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133219106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Optimized Architecture For Decomposed Convolutional Neural Networks","authors":"Fangxuan Sun, Jun Lin, Zhongfeng Wang","doi":"10.1109/ISVLSI.2018.00100","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00100","url":null,"abstract":"Convolutional neural networks (CNNs) have found extensive applications in various tasks. However, the state-of-the-art CNNs are both computation-intensive and memory-intensive, which brings tremendous hardware implementation challenges. Various methods have been proposed to reduce the model size and computation complexity of a CNN. Among them, when hardware implementation is considered, the Canonical Polyadic decomposition (CPD) method is more suitable due to the regularity in the decomposed filters. Moreover, the CPD method can be combined with widely used pruning methods to compress the model in further. In this paper, to the best of our knowledge, an efficient hardware architecture for CPD-CNNs is proposed for the first time based on a carefully designed data flow. In detail, a reconfigurable fast convolution unit is introduced to reduce the number of multiplications while handling some commonly-used convolution core operations. The proposed architecture is coded with RTL and synthesized under the TSMC 90nm CMOS technology. Our design achieves an equivalent throughput of more than 3TOP/s under 650MHz clock frequency.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121878260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Area Efficient NMOS Based Positive and Negative Voltage Multiplier","authors":"V. Rana","doi":"10.1109/ISVLSI.2018.00013","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00013","url":null,"abstract":"This paper presents NMOS based voltage multiplier circuit that can be used to generate both high positive and negative voltages from single charge-pump circuit. Basic, voltage multiplier unit consists of two phase clock signals, charge transfer NMOS transistors and bootstrapped configuration to boost the gate drive of NMOS transistors. Due to use of only NMOS transistors, output resistance of circuit is lower than conventional PMOS based circuits thus able to drive high load current. Electrical conditions of all devices used in the circuit is managed in such a way that there is no electrical stress across any device used in design. Circuit is design and implemented in BCD-110nm technology using conventional (No DMOS) transistors.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"187 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117055487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mystic: Mystifying IP Cores Using an Always-ON FSM Obfuscation Method","authors":"A. Patooghy, Ehsan Aerabi, Hamidreza Rezaei, Miguel Mark, M. Fazeli, M. Kinsy","doi":"10.1109/ISVLSI.2018.00119","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00119","url":null,"abstract":"The separation of manufacturing and design processes in the integrated circuit industry to tackle the ever increasing circuit complexity and time to market issues has brought with it some major security challenges. Chief among them is IP piracy by untrusted parties. Hardware obfuscation which locks the functionality and modifies the structure of an IP core to protect it from malicious modifications or piracy has been proposed as a solution. In this paper, we develop an efficient hardware obfuscation method, called Mystic (Mystifying IP Cores), to protect IP cores from reverse engineering, IP overproduction, and IP piracy. The key idea behind Mystic is to add additional state transitions to the original/functional FSM (Finite State Machine) that are taken only when incorrect keys are applied to the circuit. Using the proposed Mystic obfuscation approach, the underlying functionality of the IP core is locked and normal FSM transitions are only available to authorized chip users. The synthesis results of ITC99 circuit benchmarks for ASIC 45nm technology reveal that the Mystic protection method imposes on average 5.14% area overhead, 5.21% delay overhead, and 8.06% power consumption overheads while it exponentially lowers the probability that an unauthorized user will gain access to or derive the chip functionality.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132560812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On How to Efficiently Implement Deep Learning Algorithms on PYNQ Platform","authors":"Luca Stornaiuolo, M. Santambrogio, D. Sciuto","doi":"10.1109/ISVLSI.2018.00112","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00112","url":null,"abstract":"Deep Learning algorithms are gaining momentum as main components in a large number of fields, from computer vision and robotics to finance and biotechnology. At the same time, the use of Field Programmable Gate Arrays (FPGAs) for data-intensive applications is increasingly widespread thanks to the possibility to customize hardware accelerators and achieve high-performance implementations with low energy consumption. Moreover, FPGAs have demonstrated to be a viable alternative to GPUs in embedded systems applications, where the benefits of the reconfigurability properties make the system more robust, capable to face the system failures and to respect the constraints of the embedded devices. In this work, we present a framework to efficiently implement Deep Learning algorithms by exploiting the PYNQ platform, recently released by Xilinx. The case study application is tested on PYNQ-Z1 board, commonly used in embedded system applications.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"95 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114103245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Communication-Aware Module Placement for Power Reduction in Large FPGA Designs","authors":"Kalindu Herath, Alok Prakash, Udaree Kanewala, T. Srikanthan","doi":"10.1109/ISVLSI.2018.00047","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00047","url":null,"abstract":"Modern multi-million logic FPGAs allow hardware designers to map increasingly large designs into FPGAs. However, traditional FPGA CAD flows scale poorly for large designs, often producing low quality solutions in terms of performance and power in such cases. To improve the design productivity, modular design methodology partitions a large design into subsystems, compiles them individually and finally collates the individual solutions to complete the mapping process. Existing work has attempted to partition large designs into smaller subsystems, based on the intra-subsystem communication frequencies, to reduce routing power dissipation. However, inter-subsystem communication has not been considered, especially, during the placement stage. In this paper, we first show the adverse effect of ignoring the inter-subsystem communication during the placement stage. Next, we propose an inter-subsystem communication-aware placement technique using a Simulated Annealing based approach to achieve significant power savings. Experimental results show over 7% reduction in routing power when compared to the existing state-of-the-art partitioning flow that does not consider inter-subsystem communication, while the routing power reduction is over 11% when compared to a commercial CAD tool such as Altera Quartus.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114761559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Bandwidth Off-Chip Memory Access Through Hybrid Switching and Inter-Chip Wireless Links","authors":"G. Harsha, H. Mondal, Sujay Deb","doi":"10.1109/ISVLSI.2018.00028","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00028","url":null,"abstract":"Off-chip memory performance in many core processors has remained unscaled due to limited pin bandwidth, number of memory controllers and interconnect limitations. It is one of the major bottlenecks for achieving high performance in many core processors, especially with increasing bandwidth requirements as more cores are integrated on a single chip. To achieve high bandwidth memory access, we propose an interconnection architecture with (i) off-chip wireless links for main memory access and (ii) hybrid switching with packet and circuit switching in on-chip mesh network. The off-chip wireless links are designed to provide high data and low energy access to off-chip memory. We enhance the intra-chip network by establishing circuit switch links between caches and memory controllers to provide low latency access, while inter-core communication is achieved through packet switching. The performance evaluation of the proposed architectures shows that they improve performance by 31.09% in runtime and 64.76% in memory access latency as compared to baseline, while consuming 56.57% less energy.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123876422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design-Based Fingerprinting Using Side-Channel Power Analysis for Protection Against IC Piracy","authors":"James Shey, Naghmeh Karimi, R. Robucci, C. Patel","doi":"10.1109/ISVLSI.2018.00117","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00117","url":null,"abstract":"Intellectual property (IP) and integrated circuit (IC) piracy are of increasing concern to IP/IC providers because of the globalization of IC design flow and supply chains. Such globalization is driven by the cost associated with the design, fabrication, and testing of integrated circuits and allows avenues for piracy. To protect the designs against IC piracy, we propose a fingerprinting scheme based on side-channel power analysis and machine learning methods. The proposed method distinguishes the ICs which realize a modified netlist, yet same functionality. Our method doesn't imply any hardware overhead. We specifically focus on the ability to detect minimal design variations, as quantified by the number of logic gates changed. Accuracy of the proposed scheme is greater than 96 percent, and typically 99 percent in detecting one or more gate-level netlist changes. Additionally, the effect of temperature has been investigated as part of this work. Results depict 95.4 percent accuracy in detecting the exact number of gate changes when data and classifier use the same temperature, while training with different temperatures results in 33.6 percent accuracy. This shows the effectiveness of building temperature-dependent classifiers from simulations at known operating temperatures.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121981264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comprehensive Electro-Optical Model for Silicon Photonic Switches","authors":"Xuanqi Chen, Zhifei Wang, Yi-Shing Chang, Jiang Xu, Peng Yang, Zhehui Wang, Luan H. K. Duong","doi":"10.1109/ISVLSI.2018.00024","DOIUrl":"https://doi.org/10.1109/ISVLSI.2018.00024","url":null,"abstract":"Optical networks are revolutionizing computing systems by improving the energy efficiency, bandwidth, and latency of data movements. Silicon photonic switches, such as microresonator and Mach-Zehnder Interferometer (MZI), are the basic building blocks of optical networks. This work proposes a SPICE-compatible electro-optical co-simulation model, BOSIM, to systematically study silicon photonic switches using PN, PIN, and MIS capacitor device technologies. BOSIM holistically models both transient and steady-state properties such as switching speed, power, transmission spectrum, area and carrier distribution. BOSIM is validated by the measured data from eight research groups and companies. Compared to microresonator, BOSIM shows that MZI can provide 1.24X performance for a 128-core multiprocessor using photonic network-on-chip but cost 2.20X energy and 7.3X area.","PeriodicalId":114330,"journal":{"name":"2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115145928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}