N. Venkateswaran, R. Hariharan, V. Srinivasan, R. Kannan, P. Thinakaran, Vigneshwaran Sankaran, B. Vasudevan, Ravindhiran Mukundrajan, N. Nachiappan, Aswinkumar Sridharan, Karthikeyan P. Saravanan, Vignesh Adhinarayanan, V. Sankaranarayanan
{"title":"SCOC IP Cores for Custom Built Supercomputing Nodes","authors":"N. Venkateswaran, R. Hariharan, V. Srinivasan, R. Kannan, P. Thinakaran, Vigneshwaran Sankaran, B. Vasudevan, Ravindhiran Mukundrajan, N. Nachiappan, Aswinkumar Sridharan, Karthikeyan P. Saravanan, Vignesh Adhinarayanan, V. Sankaranarayanan","doi":"10.1109/ISVLSI.2012.80","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.80","url":null,"abstract":"A high performance and low power node architecture becomes crucial in the design of future generation supercomputers. In this paper, we present a generic set of cells for designing complex functional units that are capable of executing an algorithm of reasonable size. They are called Algorithm Level Functional Units (ALFUs) and a suitable VLSI design paradigm for them is proposed in this paper. We provide a comparative analysis of many core processors based on ALFUs against ALUs to show the reduced generation of control signals and lesser number of memory accesses, instruction fetches along with increased cache hit rates, resulting in better performance and power consumption. ALFUs have led to the inception of the Super Computer On Chip (SCOC) IP core paradigm for designing high performance and low power supercomputing clusters. 
The proposed SCOC IP cores are compared with the existing IP cores used in supercomputing clusters to bring out the improved features of the former.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134396717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matthew Morrison, Matthew Lewandowski, N. Ranganathan
{"title":"Design of a Tree-Based Comparator and Memory Unit Based on a Novel Reversible Logic Structure","authors":"Matthew Morrison, Matthew Lewandowski, N. Ranganathan","doi":"10.1109/ISVLSI.2012.61","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.61","url":null,"abstract":"Programmable reversible logic is gaining wide consideration as a logic design style for modern nanotechnology and quantum computing with minimal impact on circuit heat generation in improved computer architecture and arithmetic logic unit designs. In this paper, a 2*2 Swap gate, which is a reduced implementation in terms of quantum cost and delay compared to the previous Swap gate, is presented. Then, a novel 3*3 programmable UPG gate capable of calculating the universal logic calculations is presented and verified, and its advantages over the Toffoli and Peres gates are discussed. The UPG is then implemented in a reduced design for calculating n-bit AND, n-bit OR and n-bit ZERO calculations. Then, two 3*3 RMUX gates capable of multiplexing two input values with reduced quantum cost and delay compared to the previously existing Fredkin gate are presented and verified. Next, a novel 4*4 reversible programmable RC gate capable of nine unique logical calculations at low cost and delay is presented and verified. The UPG and RC are implemented in the design of novel sequential and tree-based comparators. These designs are compared to previously existing designs, and their advantages in terms of cost and delay are analyzed. Then, the RMUX is used to improve a reversible SRAM cell we previously presented. 
The memory cell and comparator are implemented in the design of a Min/Max Comparator device.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134045383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. Vudadha, P. Phaneendra, Syed Ershad Ahmed, S. Veeramachaneni, N. Muthukrishnan, M. Srinivas
{"title":"Design and Analysis of Reversible Ripple, Prefix and Prefix-Ripple Hybrid Adders","authors":"C. Vudadha, P. Phaneendra, Syed Ershad Ahmed, S. Veeramachaneni, N. Muthukrishnan, M. Srinivas","doi":"10.1109/ISVLSI.2012.50","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.50","url":null,"abstract":"Reversible computing has emerged as promising technology having its applications in quantum computing, nanotechnology and optical computing. This paper presents design and analysis of reversible ripple, prefix and prefix ripple hybrid adders. Firstly an analysis and comparison of all the existing reversible ripple carry adders is presented. The reversible ripple carry adders are characterized by high quantum depth, low quantum cost and/or low garbage outputs and ancilla inputs bits. Secondly design methodology for reversible prefix adders is presented. The reversible prefix adders are characterized by low quantum depth, high quantum cost and/or high garbage outputs and ancilla inputs bits. Finally design of the proposed reversible prefix-ripple hybrid adders is presented and comparison of the different parameters of reversible ripple, prefix and prefix-ripple hybrid adders is illustrated.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"704 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132879188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Lee, Seungcheol Baek, Jongman Kim, C. Nicopoulos
{"title":"A Compression-Based Hybrid MLC/SLC Management Technique for Phase-Change Memory Systems","authors":"H. Lee, Seungcheol Baek, Jongman Kim, C. Nicopoulos","doi":"10.1109/ISVLSI.2012.62","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.62","url":null,"abstract":"The storage density of PCM has been demonstrated to double through the employment of Multi-Level Cell (MLC) PCM arrays. However, this increase in capacity comes at the expense of increased latency (both read and write) and decreased long-term endurance, as compared to the more conventional Single-Level Cell (SLC) PCM. These negative traits of MLCs detract from the potentially invaluable storage benefits. This paper introduces a compression-based hybrid MLC/SLC PCM management technique that aims to combine the performance edge of SLCs with the higher capacity of MLCs in a hybrid environment. Our trace-driven simulations with real application workloads demonstrate that the proposed technique achieves 3.6X performance enhancement and 72% energy reduction, on average, as compared with MLC-only configurations, while always providing the same effective capacity as the MLC-only mode.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129380293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Layer Techniques for Optimizing Systems Utilizing Memories with Asymmetric Access Characteristics","authors":"Yong Li, A. Jones","doi":"10.1109/ISVLSI.2012.65","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.65","url":null,"abstract":"Non-volatile memory technologies promise a variety of advantages for memory architectures of next generation computing systems. However, these capabilities come at the cost of some inefficiencies governing the operation of these memories. The most well understood is the asymmetry of access. In order to most effectively take advantage of the benefits of these memory technologies in terms of density and reduced static power in systems while mitigating access complexity, a one-size-fits-all method is not sufficient for all types of applications. Instead, cross-layer techniques that include the compiler, operating system, and hardware layer can extract characteristics from the application that can be used to deliver the highest possible performance while minimizing power consumption for systems using these memories.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131104109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geostatistical-Inspired Metamodeling and Optimization of Nano-CMOS Circuits","authors":"Oghenekarho Okobiah, S. Mohanty, E. Kougianos","doi":"10.1109/ISVLSI.2012.12","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.12","url":null,"abstract":"With the continuous progression of semiconductor technology, nanoscale effects have become a persistent issue in the design of analog/mixed-signal (AMS) circuits. The cost of exploration and optimization of the design space increases to infeasible levels with conventional design methodologies. Different modeling techniques to reduce the cost of design exploration, while ensuring the accuracy of such models, have been introduced and continue to be a research problem. In this paper, a geostatistical inspired metamodeling and optimization technique is presented for fast and accurate design optimization of nano-CMOS circuits. The proposed design methodology incorporates a simple Kriging based metamodel which efficiently and accurately predicts design performance. The metamodel (instead of the circuit netlist) is subjected to a Gravitational Search Algorithm for optimization. This design methodology is applicable to AMS circuits and is illustrated with the optimization of power consumption of a 45nm CMOS thermal sensor. 
The method improves the power performance of the thermal sensor by 36.9% while reducing the design optimization time by 90% even with 6 design parameters.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116095877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NVMain: An Architectural-Level Main Memory Simulator for Emerging Non-volatile Memories","authors":"Matthew Poremba, Yuan Xie","doi":"10.1109/ISVLSI.2012.82","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.82","url":null,"abstract":"Emerging non-volatile memory (NVM) technologies, such as PCRAM and STT-RAM, have demonstrated great potential as candidates for replacing DRAM-based main memory design for computer systems. It is important for computer architects to model such emerging memory technologies at the architecture level, to understand the benefits and limitations for better utilizing them to improve the performance/energy/reliability of future computing systems. In this paper, we introduce an architectural-level simulator called NVMain, which can model main memory design with both DRAM and emerging non-volatile memory technologies, and can help designers perform design space explorations utilizing these emerging memory technologies. We discuss design points of the simulator and provide validation of the model, along with case studies on using the tool for design space explorations.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128446666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Delay Analysis for an N-Input Current Mode Threshold Logic Gate","authors":"Chandra Babu Dara, T. Haniotakis, S. Tragoudas","doi":"10.1109/ISVLSI.2012.34","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.34","url":null,"abstract":"A recent approach is capable of identifying threshold logic functions with as many as fifty inputs with small integer weights on the inputs. An analytical method is presented for selecting optimum sensor sizes. This allows us to design large threshold functions with delay much less than a network of CMOS gates. Exhaustive SPICE simulations show that implemented TLGs by the proposed approach consistently exhibit behavior very close to the optimal.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"137 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126299346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-Width-Driven Power Gating of Integer Arithmetic Circuits","authors":"T. Hoang, P. Larsson-Edefors","doi":"10.1109/ISVLSI.2012.59","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.59","url":null,"abstract":"When performing narrow-width computations, power gating of unused arithmetic circuit portions can significantly reduce leakage power. We deploy coarse-grain power gating in 32-bit integer arithmetic circuits that frequently will operate on narrow-width data. Our contributions include a design framework that automatically implements coarse-grain power-gated arithmetic circuits considering a narrow-width input data mode, and an analysis of the impact of circuit architecture on the efficiency of this data-width-driven power gating scheme. As an example, with a performance penalty of 6.7%, coarse-grain power gating of a 45-nm 32-bit multiplier is demonstrated to yield an 11.6× static leakage energy reduction per 8×8-bit operation.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"38 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126202514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. Vudadha, P. Phaneendra, S. Veeramachaneni, Syed Ershad Ahmed, N. Muthukrishnan, M. Srinivas
{"title":"Design of Prefix-Based Optimal Reversible Comparator","authors":"C. Vudadha, P. Phaneendra, S. Veeramachaneni, Syed Ershad Ahmed, N. Muthukrishnan, M. Srinivas","doi":"10.1109/ISVLSI.2012.49","DOIUrl":"https://doi.org/10.1109/ISVLSI.2012.49","url":null,"abstract":"This paper presents a design of prefix grouping based reversible comparator. Reversible computing has emerged as promising technology having its applications in emerging technologies like quantum computing, optical computing etc. The proposed reversible comparator design consists of three stages. The first stage consists of a 1-bit comparator where two outputs, gi indicating Ai > Bi and ei indicating Ai = Bi, are generated for ith operand bits. The outputs of 1-bit comparator stage are grouped in the second stage using prefix grouping and the final outputs G indicating A > B and E indicating A = B are generated. In the last stage the outputs of second stage i.e. G and E are used to generate L signal indicating A < B. The proposed 64-bit comparator design results in 63% reduced quantum delay, 21% reduced quantum cost and 16% reduced garbage outputs when compared with the best existing design of tree based comparator.","PeriodicalId":398850,"journal":{"name":"2012 IEEE Computer Society Annual Symposium on VLSI","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134388027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}