T. Chan, Kwangsoo Han, A. Kahng, Jae-Gon Lee, S. Nath
{"title":"OCV-aware top-level clock tree optimization","authors":"T. Chan, Kwangsoo Han, A. Kahng, Jae-Gon Lee, S. Nath","doi":"10.1145/2591513.2591541","DOIUrl":"https://doi.org/10.1145/2591513.2591541","url":null,"abstract":"The clock trees of high-performance synchronous circuits have many clock logic cells (e.g., clock gating cells, multiplexers and dividers) in order to achieve aggressive clock gating and required performance across a wide range of operating modes and conditions. As a result, clock tree structures have become very complex and difficult to optimize with automatic clock tree synthesis (CTS) tools. In advanced process nodes, CTS becomes even more challenging due to on-chip variation (OCV) effects. In this paper, we present a new CTS methodology that optimizes clock logic cell placements and buffer insertions in the top level of a clock tree. We formulate the top-level clock tree optimization problem as a linear program that minimizes a weighted sum of timing slacks, clock uncertainty and wirelength. Experimental results in a commercial 28nm FDSOI technology show that our method can improve post-CTS worst negative slack across all modes/corners by up to 320ps compared to a leading commercial provider's CTS flow.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123203595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A memory mapping approach based on network customization to design conflict-free parallel hardware architectures","authors":"Saeed Ur Reehman, C. Chavet, P. Coussy","doi":"10.1145/2591513.2591532","DOIUrl":"https://doi.org/10.1145/2591513.2591532","url":null,"abstract":"Parallel hardware architectures are needed to achieve high throughput systems. Unfortunately, efficient parallel architectures often require removing memory access conflicts. This is particularly true when designing turbo-codes, channel interleaver or LDPC (Low Density Parity Check) codes architectures which are one of the most critical parts of parallel decoders. Many solutions are proposed in state of the art to find conflict free memory mapping but they are either limited to a subset of constraints, or result in high architectural cost. These drawbacks come from the interleaving law and the incompatibility between this law and the targeted interconnection network (in the coder/encoder architecture). In this paper we propose a conflict free memory mapping approach that is able to generate optimized hardware architectures by limiting these drawbacks. The proposed solution constructs a customized interconnection network by analyzing data access patterns defined in the interleaving law. Our approach is then compared to state of the art methods and its interest is shown through the design of parallel interleavers for HSPA.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115888053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Papandreou, Thomas Parnell, H. Pozidis, T. Mittelholzer, E. Eleftheriou, C. Camp, T. Griffin, G. Tressler, Andrew Walls
{"title":"Using adaptive read voltage thresholds to enhance the reliability of MLC NAND flash memory systems","authors":"N. Papandreou, Thomas Parnell, H. Pozidis, T. Mittelholzer, E. Eleftheriou, C. Camp, T. Griffin, G. Tressler, Andrew Walls","doi":"10.1145/2591513.2591594","DOIUrl":"https://doi.org/10.1145/2591513.2591594","url":null,"abstract":"NAND Flash memory is not only the ubiquitous storage medium in consumer applications, but has also started to appear in enterprise storage systems as well. MLC and TLC Flash technology made it possible to store multiple bits in the same silicon area as SLC, thus reducing the cost per amount of data stored. However, at current sub-20nm technology nodes, MLC Flash devices fail to provide the levels of raw reliability, mainly cycling endurance, that are required by typical enterprise applications. Advanced signal-processing and coding schemes are needed to improve the Flash bit error rate and thus elevate the device reliability to the desired level. In this paper, we report on the use of adaptive voltage thresholds in the read operation of NAND Flash devices. We discuss how the optimal read voltage thresholds can be determined, and assess the benefit of adapting the read voltage thresholds in terms of cycling endurance, data retention and resilience to read disturb.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121948505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matthias Hartmann, H. Kukner, Prashant Agrawal, P. Raghavan, L. Perre, W. Dehaene
{"title":"Modelling and mitigation of time-zero variability in sub-16nm finfet-based STT-MRAM memories","authors":"Matthias Hartmann, H. Kukner, Prashant Agrawal, P. Raghavan, L. Perre, W. Dehaene","doi":"10.1145/2591513.2591573","DOIUrl":"https://doi.org/10.1145/2591513.2591573","url":null,"abstract":"Spin-transfer torque magnetic RAM (STT-MRAM) is one of the most promising non-volatile memory technologies and shows potential as an SRAM replacement. However, targeted for advanced CMOS technologies such as the 14nm FinFET node, time-zero variability is a major concern for these memory technologies. In this paper, we investigate the STT-MRAM variability with respect to different technology scenarios. We show the impact of these variations on the bit error rate of the emerging STT-MRAM memories.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132800949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Securely outsourcing power grid simulation on cloud","authors":"Naval Gupte, Jia Wang","doi":"10.1145/2591513.2591547","DOIUrl":"https://doi.org/10.1145/2591513.2591547","url":null,"abstract":"Power grid (PG) simulation is critical for verification of supply noises in IC design. Computational demands for simulating PG is high. Cloud computing can be leveraged to mitigate these costs. However, simulating on third-party platforms lead to major security concerns. We propose a framework for secure PG simulation on Cloud. Multiple compression strategies are employed to reduce communication overhead. Turnaround time similar to an insecure simulator on Cloud can be achieved, while securing current excitations and output voltage vectors at reasonable costs.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120961891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Md Shahriar Shamim, N. Mansoor, A. Samaiyar, A. Ganguly, Sujay Deb, S. S. Ram
{"title":"Energy-efficient wireless network-on-chip architecture with log-periodic on-chip antennas","authors":"Md Shahriar Shamim, N. Mansoor, A. Samaiyar, A. Ganguly, Sujay Deb, S. S. Ram","doi":"10.1145/2591513.2591566","DOIUrl":"https://doi.org/10.1145/2591513.2591566","url":null,"abstract":"On-chip wireless interconnects have emerged as a promising alternative to conventional wireline interconnects in Network-on-Chip (NoC) fabrics for multicore systems. However, it is not practical in the immediate future to arbitrarily scale up the number of wireless links without innovations in the physical layer. Here, we explore the design of a directional on-chip antenna based on a log-periodic structure. In this paper we propose the design of a wireless NoC (WiNoC) architecture with concurrent wireless links using these directional on-chip antennas. Through cycle accurate simulations we demonstrate that this novel WiNoC architecture attains better performance and energy efficiency compared to the state-of-the-art token based WiNoC of similar topology.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121328561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel low-power and in-place split-radix FFT processor","authors":"Z. Qian, M. Margala","doi":"10.1145/2591513.2591563","DOIUrl":"https://doi.org/10.1145/2591513.2591563","url":null,"abstract":"Split-radix Fast Fourier Transform (SRFFT) approximates the minimum number of multiplications by theory among all the FFT algorithms. Since multiplications significantly contribute to the overall system power consumption, SRFFT is a good candidate for implementation of a low power FFT processor. In this paper we present a novel low power SRFFT processor using a modified radix-2 butterfly structure. With the proposed butterfly unit, the address generation scheme for conventional radix-2 FFT could be applied to SRFFT and therefore it can avoid the complexity of address generation and interim data registers. Simulation results show that compared with a conventional radix-2 implementation, power consumption of the new processor is reduced by an amount of 11.7% and 18.3% for 16-point and 32-point FFT respectively.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123682794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and analysis of robust and wide operating low-power level-shifter for embedded dynamic random access memory","authors":"Kenneth Ramclam, Swaroop Ghosh","doi":"10.1145/2591513.2591533","DOIUrl":"https://doi.org/10.1145/2591513.2591533","url":null,"abstract":"Level shifters (LS) are crucial components in low power design where the die is segregated in multiple voltage domains. LS are used at the voltage domain interfaces to mitigate sneak path current. Another important application of LS is in high voltage drivers for designs where voltage boosting is needed for performance and functionality. We explore one such application in embedded Dynamic Random Access Memories (eDRAM) where LS is employed in the wordline path. Our investigation reveals that leakage power of LS can pose a serious threat by lowering the wordline voltage and subsequently affecting the speed and retention time of eDRAM. Furthermore the delay of LS under worse case process corners can cause functional discrepancies. We propose low-power pulsed-LS with supply gating to circumvent these issues. Our analysis indicate that pulsed-LS can improve the worst case speed from 2.7%-43%. We also propose power-gating for LSs to improve the retention time and bandwidth with minimal power and area overhead.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129593112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EDA for extreme scale systems: design abstractions, metrics, and benchmarks","authors":"A. Jones","doi":"10.1145/2591513.2597170","DOIUrl":"https://doi.org/10.1145/2591513.2597170","url":null,"abstract":"The context for EDA research is rapidly changing thanks to enhanced and novel switching devices, manufacturing technologies, new application targets, and the increasing software development effort required for new ICs. These trends continue to expand the gap between the capabilities of systems and what can be utilized by designers. To address these problems requires a collaborative effort with industry researchers, academics, and funding agencies working together in close partnership. This talk describes recommendations from the recent CCC workshop series on EDA in the Extreme Scale era for improving the collaboration between IC designers and EDA. There remains a continued importance of effective design abstractions that facilitate research on EDA advances that can be effectively translated to actual design flows for relevant technologies. Further, these abstractions must be accompanied by (i) effective design metrics, especially for new technologies where optimization objectives may not be obvious, and (ii) appropriate benchmarks, especially for more established technologies where alternative optimization techniques must be carefully compared. Focus on these research directions for EDA will have direct impact to reduce the existing capabilities gap between tools and designers.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130459598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marta Ortín-Obón, L. Ramini, H. Tatenguem, V. Viñals, D. Bertozzi
{"title":"A complete electronic network interface architecture for global contention-free communication over emerging optical networks-on-chip","authors":"Marta Ortín-Obón, L. Ramini, H. Tatenguem, V. Viñals, D. Bertozzi","doi":"10.1145/2591513.2591536","DOIUrl":"https://doi.org/10.1145/2591513.2591536","url":null,"abstract":"Although many valuable research works have investigated the properties of optical networks-on-chip (ONoCs), the vast majority of them lack an accurate exploration of the network interface architecture (NI) required to support optical communications on the silicon chip. The complexity of this architecture is especially critical for a specific kind of ONoCs: wavelength-routed ones. From a logical viewpoint, they can be considered as full nonblocking crossbars, thus the control complexity is implemented at the NIs. To our knowledge, this paper proposes the first complete NI architecture for wavelength-routed optical NoCs, by coping with the intricacy of networking issues such as flow control, buffering strategy, deadlock avoidance, serialization, and above all, with their codesign in a complete architecture.","PeriodicalId":272619,"journal":{"name":"ACM Great Lakes Symposium on VLSI","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114319873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}