{"title":"Physical Design Variation in Relative Timed Asynchronous Circuits","authors":"Tannu Sharma, K. Stevens","doi":"10.1109/ISVLSI.2017.56","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.56","url":null,"abstract":"Variations in integrated circuits stem from multiple sources. This paper studies variations in placement and delay that occur when using commercial EDA in a relatively unsupported fashion – to implement large unclocked circuits. A tool suite is built to study placed and routed designs. Significant variations in physical placement are shown, leading to degradation in performance, power efficiency, and robustness. An experimental method of mitigating timing and placement variation using relative place directives is applied, resulting in circuits that are 7% faster and 4% lower power.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123830475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Performance and Energy-Efficient 256-Bit CMOS Priority Encoder","authors":"D. Balobas, Nikos Konofaos","doi":"10.1109/ISVLSI.2017.30","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.30","url":null,"abstract":"A high-performance and energy-efficient 256-bit CMOS priority encoder is presented and realized on transistor level using 32 nm predictive technology. The new circuit is designed with a full custom approach and incorporates 2 novel logic styles: the Multiple-Output Monotonic CMOS (M2CMOS) and the Dynamic Inversion technique (DI). The achieved performance is in the order of O(log2(N)), with respect to the input size. A simulation-based comparative analysis concludes that, compared to the conventional design, the proposed circuit achieves up to 57% improvement in delay, 8% improvement in energy consumption and 39% improvement in EDP, while maintaining 20% smaller transistor count.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129934082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A CAD Open Platform for High Performance Reconfigurable Systems in the EXTRA Project","authors":"Marco Rabozzi, Rolando Brondolin, Giuseppe Natale, Emanuele Del Sozzo, M. Hübner, A. Brokalakis, C. Ciobanu, D. Stroobandt, M. Santambrogio","doi":"10.1109/ISVLSI.2017.71","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.71","url":null,"abstract":"As the power wall has become one of the main limiting factors for the performance of general purpose processors, the trend in High Performance Computing (HPC) is moving towards application-specific accelerators in order to meet the stringent performance requirements for exascale computing while still satisfying power budget constraints. Within this context, reconfigurable devices, and more specifically FPGA-based systems, represent a promising solution able to achieve highly energy efficient computations without jeopardizing performance. Nevertheless, the exploitation of reconfigurable hardware is still limited due to the hardware-software co-design challenges that it poses, the time consuming design space exploration process and the programming complexity. To overcome these challenges, the EXTRA European project addresses the reconfigurability of such devices as a first-class feature, covering the entire stack from the system architecture up to the application. Within this paper, we present the effort of the EXTRA project towards the definition of an adaptive open platform for the optimization and implementation of applications on high performance reconfigurable architectures. The underlying infrastructure of the platform is here presented, highlighting its capability to integrate modules from different developers in order to stimulate external contributions and open research.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127124030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel Pulsed-Latch Replacement in Non-Volatile Flip-Flop Core","authors":"Hao Cai, You Wang, L. Naviner, Weisheng Zhao","doi":"10.1109/ISVLSI.2017.19","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.19","url":null,"abstract":"In this paper, we propose efficient scalable nonvolatile flip-flops (NV-FF) with single-stage pulsed latch which is explored as the flip-flop core in hybrid CMOS/MTJ (magnetic tunnel junction) integration. Typical full-custom FF cores are implemented with a 28nm ultra-thin body and buried oxide (UTBB) fully depleted silicon-on-insulator (FD-SOI) technology. The performance analysis takes into account sensing delay, dynamic power, leakage power and process variations. Results show that the transmission gate pulsed latch (TGPL) based NVFF exhibits enhanced performance compared to conventional master-slave structure, with improved variability, 15.7% fast timing metric, 76% dynamic, 79% leakage power reduction and 30% layout area reduction in multi-bit NV-FF hybrid circuit integration. The pulsed latch FF core can enhance NVFF scalability with increased energy-delay and layout efficiency, as well as reduced active and leakage energy.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127466666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CAPSL: The Component Authentication Process for Sandboxed Layouts","authors":"Taylor J. L. Whitaker, C. Bobda","doi":"10.1109/ISVLSI.2017.78","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.78","url":null,"abstract":"In this work, we propose a system-on-chip (SoC) design tool for the automatic generation of hardware sandboxes for securing untrusted IP to be integrated into trusted systems. The Component Authentication Process for Sandboxed Layouts (CAPSL) is a design flow that incorporates behavioral specifications of IP interfaces in order to generate sandboxes purposed for detecting trojan activation and isolating possible damage to a system at run-time. CAPSL adopts two formal models, interface automata and the Property Specification Language's sequential extended regular expressions (SERE), to generate reference monitors governing interactions of a collection of non-trusted IP. The sandbox partitions an untrusted sector that includes the non-secure IP and appropriate virtualized resources and controllers to isolate sandbox-system interactions upon deviation from the behavioral checkers. We review our design flow with an analysis of behavioral policy versatility and detection and defense mechanisms employed for various Trust-Hub.org benchmarks. Also presented is a brief resource evaluation highlighting CAPSL's reduced overhead compared to other run-time verification techniques.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126294403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Power Efficient System Design Methodology Employing Approximate Arithmetic Units","authors":"Tuba Ayhan, Firat Kula, M. Altun","doi":"10.1109/ISVLSI.2017.50","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.50","url":null,"abstract":"In this work a power efficient approximate system design methodology is introduced and its performance is demonstrated by a 2D-DCT implementation on Spartan 3 FPGA. The method is applicable to any system with arithmetic computation regardless of their architecture, because it utilizes the existing approximate arithmetic units. The novelty of the proposed method is its system analysis approach starting from the highest level and exploring through the sub-blocks down to the basic arithmetic units. It first evaluates a given system block diagram and sets the desired performance limits of each processing block to achieve the desired ultimate quality metric. Then, the arithmetic power consumption is minimized by employing the appropriate arithmetic units which are chosen by linear/non-linear programming with linear constraint solver. The tests on 2D-DCT implementation show a power reduction of 8% for a 0.01 dB PSNR loss for 128x128 images, on the average.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129045654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Opamp and Capacitor Sharing 10 Bit 20 MS/s Low Power Pipelined ADC in 0.18µm CMOS Technology","authors":"R. Greeshma, K. AnoopV., B. Venkataramani","doi":"10.1109/ISVLSI.2017.110","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.110","url":null,"abstract":"In this paper, a novel 10 bit 20 MS/s, low power, pipelined ADC using both capacitor and opamp sharing techniques is proposed. In the proposed ADC, feedback capacitors in the first four stages are shared between adjacent stages in order to decrease the power dissipation of the opamps used in these stages. The opamps in the six pipeline stages are also shared between adjacent stages in pairs for further power reduction. The proposed ADC is designed in UMC 0.18um CMOS technology with a supply voltage of 1.8V and simulated in Cadence Spectre Simulator. The ADC achieves 9.5 bit accuracy with 59.44dB SNR and 70.92dB SFDR and dissipates 4.36mW power. The proposed ADC has better FOM compared to that reported in the literature.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117248545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combined TDM and SDM Circuit Switching NoCs with Dedicated Connection Allocator","authors":"Yong Chen, E. Matús, G. Fettweis","doi":"10.1109/ISVLSI.2017.27","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.27","url":null,"abstract":"In general, circuit switching (CS) NoCs suffer from path diversity and resource utilization problems. Combining Time-Division Multiplexing (TDM) and Space-Division Multiplexing (SDM) CS NoCs can reasonably mitigate these problems by increasing the path diversity and improving sharing of sub-channels among multiple connections. In order to investigate and optimize the TDM-SDM partitioning strategy, in this paper, we propose a dedicated connection allocator for combined TDM-SDM CS NoCs based on a trellis-search algorithm, which can explore all possible paths between source-destination node pairs within a guaranteed latency. In contrast to recently published approaches, we propose a novel bidirectional search that starts at the source node and the destination node simultaneously. Compared to previous unidirectional trellis search algorithms, our algorithm halves the search time while keeping the allocator area almost the same. In addition, we studied the influence of different TDM-SDM link partitioning strategies on success rate and path length, which allowed us to find the optimal solution. The simulation results show our approach can improve the success rate by 25% to 42% compared to previous connection allocation approaches.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121569629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Functional Broadside Test Generation Using a Commercial ATPG Tool","authors":"Naixing Wang, Bo Yao, X. Lin, I. Pomeranz","doi":"10.1109/ISVLSI.2017.61","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.61","url":null,"abstract":"Scan-based tests may lead to overtesting of delay faults by bringing a circuit to states that the circuit cannot enter during functional operation. Functional broadside tests address this issue by using reachable states as scan-in states. Different strategies for generating functional broadside tests have been studied and implemented by academic tools. The main challenge that these procedures address is the identification of reachable states that are useful as scan-in states. This paper describes the generation of functional broadside tests using a commercial test generation tool. Our results demonstrate that it is possible to generate functional broadside tests without requiring any modifications to the commercial tool, and using the tests that the tool produces to obtain reachable states. This is expected to enable the generation of functional broadside tests for state-of-the-art designs that cannot be handled by academic tools. To demonstrate this point, we apply the procedure to two large logic blocks of the OpenSPARC T1 microprocessor.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124511461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing Search Space for Fault Diagnosis: A Probability-Based Scoring Approach","authors":"H. Bidgoli, Payman Behnam, B. Alizadeh, Z. Navabi","doi":"10.1109/ISVLSI.2017.101","DOIUrl":"https://doi.org/10.1109/ISVLSI.2017.101","url":null,"abstract":"Fault diagnosis is one of the most important phases in the VLSI design cycle. This paper proposes a probabilistic solution for the fault diagnosis in the sequential scan-based circuits. Our approach uses a signal probability analysis to score and rank potential fault locations. The ranking results are exploited to reduce the search space for exact diagnosis approaches. The experimental results show how this technique can increase the scalability and speed of satisfiability (SAT)-based diagnosis approach.","PeriodicalId":187936,"journal":{"name":"2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116964444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}