{"title":"Circumventing a ring oscillator approach to FPGA-based hardware Trojan detection","authors":"Justin Rilling, David Graziano, Jamin Hitchcock, Tim Meyer, Xinying Wang, Phillip H. Jones, Joseph Zambreno","doi":"10.1109/ICCD.2011.6081411","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081411","url":null,"abstract":"Ring oscillators are commonly used as a locking mechanism that binds a hardware design to a specific area of silicon within an integrated circuit (IC). This locking mechanism can be used to detect malicious modifications to the hardware design, also known as a hardware Trojan, in situations where such modifications result in a change to the physical placement of the design on the IC. However, careful consideration is needed when designing ring oscillators for such a scenario to guarantee the integrity of the locking mechanism. This paper presents a case study in which flaws discovered in a ring oscillator-based Trojan detection scheme allowed for the circumvention of the security mechanism and the implementation of a large and diverse set of hardware Trojans, limited only by hardware resources.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128988609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed thermal management for embedded heterogeneous MPSoCs with dedicated hardware accelerators","authors":"Yen-Kuan Wu, Shervin Sharifi, T. Simunic","doi":"10.1109/ICCD.2011.6081395","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081395","url":null,"abstract":"This paper addresses thermal management in heterogeneous MPSoCs where the power states of the general purpose cores can be controlled by the operating system (OS) while the OS is not able to control the power states of the dedicated hardware accelerators (DHAs). We propose a scalable and cooperative distributed thermal management technique which works based on the cooperation of local controllers deployed in some of the cores. Through low overhead message passing, these controllers communicate in order to exchange temperature and performance related information which is used to find the best thermally safe set of frequency settings for the cores. Experimental results show that our technique can successfully reduce the deadline miss rate by 47.16% on average compared to localized thermal management techniques while successfully satisfying temperature constraints.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122076828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low power, high throughput network-on-chip fabric for 3D multicore processors","authors":"V. Nandakumar, M. Marek-Sadowska","doi":"10.1109/ICCD.2011.6081458","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081458","url":null,"abstract":"Long wires significantly degrade the performance of the network-on-chip (NoC) communication fabric in large multicore processors. 3D network-on-chip architecture alleviates the problem of long wires, but practical limitations of CMOS technology restrict such structures to two active layers only. In this work, we study a heterogeneous 3D chip with processor cores and cache blocks implemented in CMOS and NoC fabric in VeSFET technology. Such a 3D architecture shows significant improvements in all network parameters including latency, power and energy consumption compared to existing 3D NoCs.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124249493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-level wordline driver for low power SRAMs in nano-scale CMOS technology","authors":"F. Moradi, G. Panagopoulos, G. Karakonstantis, D. Wisland, H. Mahmoodi, J. K. Madsen, K. Roy","doi":"10.1109/ICCD.2011.6081419","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081419","url":null,"abstract":"In this paper, a multi-level wordline driver scheme is presented to improve SRAM read and write stability while lowering power consumption during hold operation. The proposed circuit applies a shaped wordline voltage pulse during read mode and a boosted wordline pulse during write mode. During read, the applied shaped pulse is tuned at nominal voltage for a short period of time, whereas for the remaining access time, the wordline voltage is reduced to a lower level. This pulse results in improved read noise margin without any degradation in access time which is explained by examining the dynamic and nonlinear behavior of the SRAM cell. Furthermore, during hold mode, the wordline voltage starts from a negative value and reaches zero voltage, resulting in a lower leakage current compared to conventional SRAM. Our simulations using TSMC 65nm process show that the proposed wordline driver results in 2X improvement in static read noise margin while the write margin is improved by 3X. In addition, the total leakage of the proposed SRAM is reduced by 10% while the total power is improved by 12% in the worst case scenario of a single SRAM cell. The total area penalty is 10% for a 128Kb standard SRAM array.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117095140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative analysis of copper and CNT interconnects for H-tree clock distribution","authors":"Vish Ganti, H. Mahmoodi","doi":"10.1109/ICCD.2011.6081443","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081443","url":null,"abstract":"The clock distribution network is an important part of digital integrated circuits. The clock signal carried by the distribution network has to reach every end node at the same time to ensure synchronized switching. Due to mismatches among different nodes of the H-tree, the clock transitions among the final nodes of the distribution tree show some time difference, the maximum of which is called clock skew. In modern CMOS technologies, copper interconnect is popular for high level interconnects such as clock and power routing. Carbon Nanotube (CNT) exhibits lower resistivity than copper, making it a better material for interconnect. This paper compares the impact on clock skew of an H-tree clock distribution network when the traditional copper interconnects are replaced with carbon nanotube interconnects. By applying temperature mismatch, threshold voltage mismatch, and process mismatch, our findings show that using carbon nanotube interconnects reduces the clock skew significantly compared to traditional copper interconnects.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122802555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A GALS Network-on-Chip based on rationally-related frequencies","authors":"Jean-Michel Chabloz, A. Hemani","doi":"10.1109/ICCD.2011.6081369","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081369","url":null,"abstract":"GALS Networks-on-Chip (NoCs) in which the frequency of every switch can be set independently would enable per-node DVFS without requiring asynchronous switch design. However, traditional GALS interfaces introduce high latency penalties and are therefore ill-suited for inter-switch links in a NoC. In this paper we introduce and study a GALS Network-on-Chip based on the Globally-Ratiochronous, Locally-Synchronous (GRLS) paradigm. GRLS constrains all switch frequencies to be rationally-related but enables the use of efficient interfaces which reduce the latency of the network by 60% compared to GALS solutions and obtain better throughput-per-power ratios compared to synchronous and mesochronous solutions.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133402449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An energy- and performance-aware DRAM cache architecture for hybrid DRAM/PCM main memory systems","authors":"H. Lee, Seungcheol Baek, C. Nicopoulos, Jongman Kim","doi":"10.1109/ICCD.2011.6081427","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081427","url":null,"abstract":"The last few years have witnessed the emergence of a promising new memory technology. Phase-Change Memory (PCM) is increasingly viewed as an attractive alternative for the memory sub-system of future microprocessor architectures, mainly because of its inherent ability to scale deeply into the nanoscale regime, and its low power consumption. However, PCM's write performance is its Achilles' heel, especially when compared to the prevalent DRAM technology. This weakness necessitates the deployment of hybridized solutions that fuse DRAM and PCM, in order to attain high overall system performance. In this paper, we set out to explore how various DRAM/PCM hybrid configurations affect system performance and energy consumption, and then proceed with the presentation of a novel architecture that maximizes performance without adversely affecting power efficiency. An energy-delay product improvement of 42.2%, on average, over conventional hybrid structures, is demonstrated.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133304579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Runtime adaptable concurrent error detection for linear digital systems","authors":"Yu Liu, Kaijie Wu","doi":"10.1109/ICCD.2011.6081406","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081406","url":null,"abstract":"In response to the rising fault susceptibility of ICs due to aggressive device scaling, a number of concurrent error detection (CED) techniques have been proposed. The existing circuit- or logic- level CED techniques aim at the worst case of fault susceptibility. Recognizing that the energy consumption of the circuitry with different CED capability varies significantly, these techniques could result in significant overhead for today's deep sub-micron devices that suffer from strong variation of fault susceptibility. In this paper, we propose a novel RT-level CED technique for linear digital systems. The proposed technique offers run-time adaptable CED so that devices will never overpay the energy bills for their CED needs.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123202726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-efficient multi-level cell phase-change memory system with data encoding","authors":"Jue Wang, Xiangyu Dong, Guangyu Sun, Dimin Niu, Yuan Xie","doi":"10.1109/ICCD.2011.6081394","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081394","url":null,"abstract":"Phase-change memory (PCM) is one of the most promising technologies among emerging non-volatile memories. Recently, the technology of multi-level cell (MLC) for PCM has been developed and a high capacity memory system can be implemented by storing multiple bits in a cell. However, programming MLC PCM involves the program-and-verify scheme. Thus, the energy of programming intermediate states in MLC PCM is considerably larger than that of single-level cell (SLC) PCM. To mitigate the MLC energy overhead, we propose an energy-efficient PCM architecture using data encoding write based on the observation that there are significant value-dependent energy variations in programming MLC PCM. In addition, data comparison write (DCW) is adopted to enhance the effectiveness of the proposed data encoding architecture for MLC PCM. Simulation results show that this encoding architecture achieves 9.6% average energy saving (up to 19.8%) on the plain MLC PCM system, and 12.9% average energy saving (up to 26.7%) on the DCW-adopted MLC PCM system.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114814205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SoftBeam: Precise tracking of transient faults and vulnerability analysis at processor design time","authors":"M. Gschwind, V. Salapura, Catherine Trammell, S. Mckee","doi":"10.1109/ICCD.2011.6081430","DOIUrl":"https://doi.org/10.1109/ICCD.2011.6081430","url":null,"abstract":"To study system reliability of a next-generation system, we undertake a soft error vulnerability study for a next-generation microprocessor design. Starting from design data for the entire processor, we extend the microprocessor verification methodology to study soft error propagation through microprocessor logic into the architected processor state. We use soft error injection into randomly selected latch bits to (1) identify areas for improvement, (2) derate technology susceptibility by architectural, microarchitectural, and logic masking resulting in increased soft error resilience; and (3) identify areas where microarchitectural data corruption can be tolerated as performance degradation without impact on correctness, yielding even greater soft error resilience. Based on these results, we reduce design vulnerability to soft errors by factors ranging from 2 for an execution unit to more than 32 for a memory management unit.","PeriodicalId":354015,"journal":{"name":"2011 IEEE 29th International Conference on Computer Design (ICCD)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115305505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}