Rimpy Bishnoi, V. Laxmi, M. Gaur, R. H. Ramlee, Mark Zwolinski
{"title":"CERI: Cost-Effective Routing Implementation Technique for Network-on-Chip","authors":"Rimpy Bishnoi, V. Laxmi, M. Gaur, R. H. Ramlee, Mark Zwolinski","doi":"10.1109/VLSID.2015.15","DOIUrl":"https://doi.org/10.1109/VLSID.2015.15","url":null,"abstract":"To deal with the communication challenges of current and future many-core architectures, Network-on-Chip (NoC) has been proposed as a promising alternative. A regular 2D mesh topology is the preferred design choice for NoCs. Hardware failures owing to manufacturing, wearout, aging, etc., however, may disrupt the regularity of the 2D mesh. Sustaining routing under these circumstances becomes a challenge. Though the traditional table-based routing method is flexible enough to handle any irregularity, it is neither a scalable nor a cost-effective solution. Scalable distributed logic-based solutions like uLBDR have limited flexibility and work only in a restricted architectural space despite their complex switch design. To overcome these limitations, this paper presents CERI (Cost-Effective Routing Implementation), an efficient logic-based routing technique capable of handling failure-induced irregularities in a 2D mesh. Implementation of the proposed approach does not require tables or a complex switch design. 
Performance analysis of CERI demonstrates its cost effectiveness: area and power requirements are reduced by 14% and 16%, respectively, compared with the previously proposed logic-based solution uLBDR.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129268465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Lavratti, L. Bolzani, F. Vargas, A. Calimera, E. Macii
{"title":"Evaluating a Hardware-Based Approach for Detecting Resistive-Open Defects in SRAMs","authors":"F. Lavratti, L. Bolzani, F. Vargas, A. Calimera, E. Macii","doi":"10.1109/VLSID.2015.74","DOIUrl":"https://doi.org/10.1109/VLSID.2015.74","url":null,"abstract":"Advances in Very Deep Sub-Micron (VDSM) technology have made possible the integration of millions of transistors into a small area and, consequently, have increased circuit density. The increasing density of nano-scale Static Random Access Memories (SRAMs) has become an important concern for testing, since it gives rise to new types of defects that can occur during the manufacturing process. The rapidly increasing need to store more information means that memory elements occupy a large part of the System-on-Chip's (SoC) silicon area. In this context, the present paper describes and evaluates a technique based on On-Chip Current Sensors (OCCS) and Neighbourhood Comparison Logic (NCL) to detect resistive-open defects in SRAMs. Experimental results obtained through simulations demonstrate the technique's efficiency as well as its behaviour under process variation. To conclude, an analysis of the overheads makes possible a comparison with today's standard techniques.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124170003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tutorial T6: FinFET Device Circuit Co-design: Issues and Challenges","authors":"S. Dasgupta, B. Anand","doi":"10.1109/VLSID.2015.114","DOIUrl":"https://doi.org/10.1109/VLSID.2015.114","url":null,"abstract":"The race to the next FinFET process node became more prominent after Intel's and TSMC's announcements that they would use tri-gate technology (FinFETs) commercially below the 20 nm node. Last year, the revealed the 16nm FinFET process that by many measures is one of the most advanced semiconductor technologies. Most other semiconductor companies and foundries are expected to adopt FinFETs at 16/14 nm in order to keep the pace set by Intel and TSMC. However, like any new technology, FinFETs with sub-20 nm feature sizes also face several design challenges. Most of these challenges arise from technological restrictions that in turn degrade performance. Although some performance boosters, such as high-permittivity spacers, enhance the device characteristics, they have limited applicability in high-performance circuit applications. Researchers have also explored various physical configurations/architectures to ease device-circuit co-design and improve overall performance. However, contradictory observations have been made with respect to device and circuit immunity to random variations, which results in ambiguity about their true applicability. Therefore, it is necessary to thoroughly investigate these novel device architectures with respect to their circuit suitability and tolerance to random variations. 
Therefore, this tutorial explores the possibilities of dual-spacer (symmetric and asymmetric) architectures for this purpose and their impact on high-performance logic circuit/SRAM applications, along with their tolerance limits to random variations.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132601432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Panda, Vishal Patel, Praxal Shah, Namita Sharma, V. Srinivasan, D. Sarma
{"title":"Power Optimization Techniques for DDR3 SDRAM","authors":"P. Panda, Vishal Patel, Praxal Shah, Namita Sharma, V. Srinivasan, D. Sarma","doi":"10.1109/VLSID.2015.59","DOIUrl":"https://doi.org/10.1109/VLSID.2015.59","url":null,"abstract":"With memory contributing a significant fraction of the overall power consumption, several power management techniques targeting the memory sub-system have been proposed by researchers. In this work, we propose two memory power optimization techniques. We first suggest dynamically varying the queue structure in a memory controller, and next propose an adaptive threshold technique for switching to the low-power SELF-REFRESH operating mode of the DDR3 SDRAM. With the proposed queue-resizing optimization, the power consumption is reduced by up to 93%, while the adaptive threshold technique results in an additional 21% power savings, on average, over a constant threshold implementation.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131744282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RELSPEC: A Framework for Early Reliability Refinement of Embedded Applications","authors":"S. Ghosh, Aritra Hazra, Soumyajit Dey","doi":"10.1109/VLSID.2015.12","DOIUrl":"https://doi.org/10.1109/VLSID.2015.12","url":null,"abstract":"The increasing complexity of safety-critical embedded applications has made it imperative to specify and analyze reliability upfront in the design flow so that reliable systems can be automatically synthesized adhering to such descriptions. This paper develops a framework, RELSPEC, to express the reliability of a safety-critical embedded application at an early stage of the design flow and to enable reliability analysis leveraging automatically constructed intermediate probabilistic models of the system. Further, our analysis provides a mechanized way to refine the reliability in order to meet a target reliability value of the overall system. Experiments on a few automotive case studies show the efficacy of this methodology.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133843588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gopinath Mahale, H. Mahale, Arnav Goel, S. Nandy, S. Bhattacharya, R. Narayan
{"title":"Hardware Solution for Real-Time Face Recognition","authors":"Gopinath Mahale, H. Mahale, Arnav Goel, S. Nandy, S. Bhattacharya, R. Narayan","doi":"10.1109/VLSID.2015.19","DOIUrl":"https://doi.org/10.1109/VLSID.2015.19","url":null,"abstract":"The objective of this paper is to come up with a scalable modular hardware solution for real-time Face Recognition (FR) on large databases. Existing hardware solutions use algorithms with low recognition accuracy suitable for real-time response. In addition, database size for these solutions is limited by on-chip resources, making them unsuitable for practical real-time applications. Due to their high computational complexity, we do not choose algorithms from the literature with superior recognition accuracy. Instead, we come up with a combination of Weighted Modular Principal Component Analysis (WMPCA) and Radial Basis Function Neural Network (RBFNN) which outperforms algorithms used in existing hardware solutions on face databases with high illumination and pose variation. We propose a hardware solution for real-time FR which uses parallel streams to perform independent modular computations. A salient feature of the proposed hardware solution is that we store a major part of the data in off-chip memory in a novel format, so that the latencies experienced when accessing off-chip memory do not impact performance. This enables us to work on databases of very large sizes. To test functional correctness, the proposed architecture is synthesized and tested on a Virtex-6 LX550T FPGA. 
This emulated system is able to perform 450 recognitions per second on images of size 128 × 128 with 450 classes.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125217174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnostic Tests for Pre-bond TSV Defects","authors":"Bei Zhang, V. Agrawal","doi":"10.1109/VLSID.2015.71","DOIUrl":"https://doi.org/10.1109/VLSID.2015.71","url":null,"abstract":"Pre-bond testing and defect identification of through-silicon vias (TSVs) is extremely important for yield assurance of 3D stacked devices. Based on a recently published pre-bond TSV probing technique, this paper proposes an ILP (integer linear programming) model to generate a near-optimal set of sessions for pre-bond TSV test. The sessions generated by our model identify defective TSVs in a TSV network with the same capability as that of other available heuristic methods, but with consistently reduced test time. The ILP model is shown to reduce the pre-bond TSV test time by 38.2% for pinpointing up to two faulty TSVs in an 11-TSV network. Reducing pre-bond TSV test time helps reduce the manufacturing cost of 3D stacked devices. This ILP model has low complexity, and an example demonstration using a commercial solver takes less than 40 seconds.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"209 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123326897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Peak Power Estimation Using Probabilistic Cost-Benefit Analysis","authors":"Hadi Hajimiri, Kamran Rahmani, P. Mishra","doi":"10.1109/VLSID.2015.68","DOIUrl":"https://doi.org/10.1109/VLSID.2015.68","url":null,"abstract":"Estimation of peak power consumption is an essential task in order to design reliable systems. Optimistic design choices can make the circuit unreliable and vulnerable to power attacks, whereas pessimistic design can lead to unacceptable design overhead. The power virus problem is defined as finding input patterns that can maximize switching activity (dynamic power dissipation) in digital circuits. In this paper, we present a fast and simple to implement power virus generation technique utilizing a probabilistic cost-benefit analysis. To maximize switching activity, our proposed algorithm iteratively enables transitions in high fan-out gates while considering the trade-off between switching of new gates (benefit) and blocking of gate transitions in the future iterations (cost) due to switching of the currently selected one. Extensive experiments using both combinational and sequential benchmarks demonstrate that our approach can achieve up to 64% more toggles (30.7% on average) for zero-delay model and improvements of up to 319% (109% on average) for unit-delay model compared to the state-of-the-art techniques.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131488642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a Compact Reversible Carry Look-Ahead Adder Using Dynamic Programming","authors":"N. J. Lisa, H. Babu","doi":"10.1109/VLSID.2015.46","DOIUrl":"https://doi.org/10.1109/VLSID.2015.46","url":null,"abstract":"This paper presents a new method for designing a reversible carry look-ahead adder (RCLA) based on dynamic programming. In this method, we propose a faster technique for generating the carry output, which also outperforms the existing ones in terms of number of operations. In addition, we design a compact reversible carry look-ahead circuit based on the proposed technique. In order to optimize our design, we propose the first known Reversible Partial Adder (RPA) circuit with the optimum quantum cost and number of garbage outputs, which concurrently produces the carry propagation signal, carry generation signal and summation of the inputs. Using RPA as a unit element of RCLA construction, we optimize the designs of RCLA and show, with the help of Microwind DSCH 3.5, that the proposed design is better than the existing ones in terms of the number of gates, quantum cost, garbage outputs and delay; e.g., the proposed 128-bit adder improves by 77.55% on number of gates, 10% on garbage outputs, 2.16% on delay and 77.61% on quantum cost over the best existing one.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127275365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Mohammadi, N. Satpute, Rohit Ronge, J. Chandiramani, S. Nandy, Aamir Raihan, Tanmay Verma, R. Narayan, S. Bhattacharya
{"title":"A Flexible Scalable Hardware Architecture for Radial Basis Function Neural Networks","authors":"M. Mohammadi, N. Satpute, Rohit Ronge, J. Chandiramani, S. Nandy, Aamir Raihan, Tanmay Verma, R. Narayan, S. Bhattacharya","doi":"10.1109/VLSID.2015.91","DOIUrl":"https://doi.org/10.1109/VLSID.2015.91","url":null,"abstract":"Radial Basis Function Neural Networks (RBFNN) are used in a variety of applications such as pattern recognition, control, time series prediction and nonlinear identification. An RBFNN with a Gaussian function as the basis function is considered for classification. Training is done offline using the K-means clustering method for center learning and the pseudo-inverse for weight adjustment. Offline training is used since the objective function can be computed for any fixed set of weights, so we can see whether training is making progress. Moreover, the minimum of the objective function can be computed to any desired precision, whereas with online training none of this can be done, and it is more difficult and unreliable. In this paper we provide a comparison of RBFNN implementations on FPGAs using a soft-core-processor-based multi-processor system versus a network of Hyper Cells [8], [13]. Next we propose three different partitioning structures (Linear, Tree and Hybrid) for the implementation of RBFNNs of large dimensions. 
Our results show that the implementation of RBFNN on a network of Hyper Cells using the Hybrid structure has, on average, a 26x clock cycle reduction and a 105x improvement in performance over the multi-processor system on FPGAs.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127852343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}