2016 IEEE 34th International Conference on Computer Design (ICCD): Latest Articles

SRAM stability analysis for different cache configurations due to Bias Temperature Instability and Hot Carrier Injection
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753284
Taizhi Liu, Chang-Chih Chen, Jiadong Wu, L. Milor
Abstract: Bias Temperature Instability (BTI) and Hot Carrier Injection (HCI) are two of the main effects that increase a transistor's threshold voltage and thereby degrade performance. These two wearout mechanisms affect all transistors, but are especially acute in the SRAM cells of first-level (L1) caches, which are frequently accessed and are critical for microprocessor performance. This work studies the cache lifetimes due to the combined effect of BTI and HCI for different cache configurations, including variation in cache size, associativity, cache line size, and the replacement algorithm. The effect of process variations is also considered. We analyze the reliability (failure probability) and performance (hit rate) of the L1 cache within a LEON3 microprocessor, while the LEON3 is running a set of benchmarks, and we provide essential insights on performance-reliability tradeoffs for cache designers.
Citations: 25
Design techniques of eNVM-enabled neuromorphic computing systems
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753356
Chang Song, Beiye Liu, Chenchen Liu, Hai Helen Li, Yiran Chen
Abstract: The recently emerged research on “neuromorphic computing”, which stands for hardware acceleration of brain-inspired computing, has become one of the most active research areas in computer engineering. In this invited paper, we start with a background introduction of neuromorphic computing, followed by some examples of hardware acceleration schemes of learning and neural network algorithms on emerging nonvolatile memory (eNVM)-based neuromorphic computing engines. At the end, we share our prospects on the future technology challenges and advances of neuromorphic computing.
Citations: 4
Ultra-low energy security circuits for IoT applications
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753358
Sudhir K. Satpathy, S. Mathew, Vikram B. Suresh, R. Krishnamurthy
Abstract: Low-area energy-efficient security primitives are key building blocks for enabling end-to-end content protection, user authentication, and consumer confidentiality in the IoT world that is estimated to surpass 50 billion smart and connected devices by 2020. This paper describes design approaches that blend energy-efficient circuit techniques with optimal accelerator microarchitecture datapath, and hardware-friendly arithmetic to achieve ultra-low energy consumption in security platforms for seamless adoption in area/battery constrained and self-powered systems. Industry-leading energy-efficiency is demonstrated with three designs, fabricated and measured in advanced process technologies: 1) A 2040-gate arithmetically optimized composite-field Sbox based AES accelerator achieves 289Gbps/W peak energy-efficiency while offering 432Mbps throughput in 22nm tri-gate CMOS. 2) A hybrid Physically Unclonable Function (PUF) circuit leverages burn-in induced aging to reduce bit-error, followed by temporal-majority-voting, dark-bit masking, and error-correction conditioning techniques to generate a 100% stable full-entropy key with 190fJ/bit energy consumption in 22nm tri-gate CMOS. 3) A light-weight all-digital TRNG uses in-line correlation suppressor and entropy-extractor circuits to achieve >0.99 min-entropy with 3pJ/bit measured energy-efficiency while operating down to 300mV in 14nm tri-gate CMOS.
Citations: 15
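The PUF design in the entry above names temporal majority voting and dark-bit masking as two of its stabilization steps. The following is only a minimal software sketch of those two ideas, not the circuit from the paper: the function names, the `dark_threshold` parameter, and the rule that a cell is "dark" when its reads agree less often than that threshold are all assumptions made for illustration.

```python
def temporal_majority_vote(samples):
    """Return the majority bit over repeated reads of one cell (ties give 0)."""
    ones = sum(samples)
    return 1 if ones * 2 > len(samples) else 0


def stabilize(reads, dark_threshold=0.9):
    """Combine repeated read-outs into key bits plus a dark-bit mask.

    reads: list of read-outs, each a list of bits (one bit per cell).
    dark_threshold: hypothetical cut-off; a cell whose reads agree with
    its majority value less often than this is masked out as "dark".
    Returns (key_bits, mask), where mask[i] is True for cells that are kept.
    """
    n_reads = len(reads)
    key_bits, mask = [], []
    for cell_samples in zip(*reads):  # iterate cell by cell
        bit = temporal_majority_vote(cell_samples)
        agreement = sum(1 for s in cell_samples if s == bit) / n_reads
        stable = agreement >= dark_threshold
        mask.append(stable)
        if stable:
            key_bits.append(bit)
    return key_bits, mask


# Five reads of a four-cell array; the third cell flips between reads.
example_reads = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
]
key_bits, mask = stabilize(example_reads)
```

In this sketch the unstable third cell is dropped from the key rather than corrected; the error-correction stage mentioned in the abstract is not modeled.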
DLL: A dynamic latency-aware load-balancing strategy in 2.5D NoC architecture
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753352
Chen Li, Sheng Ma, Lu Wang, Zicong Wang, Xia Zhao, Yang Guo
Abstract: As the 3D stacking technology still faces several challenges, the 2.5D stacking technology gains better application prospects nowadays. With the silicon interposer, the 2.5D stacking can improve the bandwidth and capacity of the memory system. To satisfy the communication requirements of the integrated memory system, the free routing resources in the interposer should be explored to implement an additional network. Yet, the performance is strongly limited by the unbalanced loads between the CPU-layer network and the interposer-layer network. In this paper, to address this issue, we propose a dynamic latency-aware load-balancing (DLL) strategy. Our key innovations are detecting congestion of the network layer via the average latency of recent packets and making the network layer selection at each source node. We leverage the free routing resources in the interposer to implement a latency propagation ring. With the ring, the latency information tracked at destination nodes is propagated back to source nodes. We achieve load balance by using this information. Experimental results show that, compared with the baseline design, a destination-detection strategy, and a buffer-aware strategy, our DLL strategy achieves average throughput improvements of 45%, 14.9%, and 6.5%, respectively, with minor overheads.
Citations: 2
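The DLL entry above describes detecting congestion from the average latency of recent packets and choosing the network layer at each source node. A rough sketch of that selection rule might look like the following. The class name `LayerSelector`, the fixed-size window, and the tie-breaking toward the first layer are assumptions for illustration; the latency propagation ring that delivers the measurements in hardware is not modeled.

```python
from collections import deque


class LayerSelector:
    """Per-source-node choice between network layers by recent average latency."""

    def __init__(self, layers=("cpu", "interposer"), window=4):
        # Keep only the last `window` latency samples per layer (assumed policy).
        self.recent = {name: deque(maxlen=window) for name in layers}

    def report_latency(self, layer, latency):
        """Record a latency measurement propagated back from a destination."""
        self.recent[layer].append(latency)

    def average_latency(self, layer):
        samples = self.recent[layer]
        return sum(samples) / len(samples) if samples else 0.0

    def choose_layer(self):
        """Pick the layer with the lowest recent average latency.

        Ties go to the layer listed first in the constructor.
        """
        return min(self.recent, key=self.average_latency)


selector = LayerSelector(window=2)
first_choice = selector.choose_layer()       # no samples yet, so a tie
for latency in (40, 60):
    selector.report_latency("cpu", latency)
for latency in (20, 30):
    selector.report_latency("interposer", latency)
second_choice = selector.choose_layer()      # interposer average is lower
for latency in (90, 100):
    selector.report_latency("interposer", latency)
third_choice = selector.choose_layer()       # interposer is now congested
```

A sliding window is one simple way to make "recent" concrete; the paper may define it differently.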
FPGA Trust Zone: Incorporating trust and reliability into FPGA designs
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753346
V. Jyothi, Manasa Thoonoli, Richard Stern, R. Karri
Abstract: This paper proposes a novel methodology, FPGA Trust Zone (FTZ), to incorporate security into the design cycle to detect and isolate anomalies such as Hardware Trojans in the FPGA fabric. Anomalies are identified using violations of the spatial correlation of process variation in the FPGA fabric. Anomalies are isolated using the Xilinx Isolation Design Flow (IDF) methodology. FTZ helps identify and partition the FPGA into areas that are devoid of anomalies and thus helps run designs securely and reliably even in an anomaly-infected FPGA. FTZ also assists IDF in selecting trustworthy areas for implementing isolated designs and trusted routes. We demonstrate the effectiveness of FTZ for AES and RC5 designs on Xilinx Virtex-7 and Artix-7 FPGAs.
Citations: 19
Memos: A full hierarchy hybrid memory management framework
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753305
Lei Liu, Hao Yang, Yong Li, Mengyao Xie, Lian Li, Chenggang Wu
Abstract: In this paper, we introduce memos, which integrates suitable memory management policies and schedules resources over the entire memory hierarchy in a hybrid memory system. Powered by an OS kernel level monitoring tool, memos captures memory patterns online, and then leverages them to guide the memory page placement and data mapping. Experimental results show that, on average, memos benefits memory utilization, improving system throughput and QoS by 19.1% and 23.6%, respectively. Moreover, memos can reduce the NVM side memory latency by 3% to 83.3% and energy consumption by 25.1% to 99%, and can benefit the NVM lifetime significantly (40× improvement on average).
Citations: 19
Strategies for optimal operating point selection in timing speculative processors
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753344
Omid Assare, Rajesh K. Gupta
Abstract: Performance of timing speculative processors relies on strategies for accurate prediction of optimal operating points. In this paper, we develop an efficient process-variation-aware simulation framework and use it to evaluate a range of such timing speculation strategies. Our experiments on a timing speculative processor running applications from the MiBench benchmark suite show that, in a typical case, while a perfect timing speculation strategy can improve throughput by up to 143% over a guardbanded design, the most commonly used approach in the literature achieves only 21.8% of the potential gains. By improving the speculation accuracy, the new strategies we propose in this paper can realize up to 35.6% of the potential gains, a throughput improvement of 50.9% over a guardbanded design.
Citations: 1
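The percentages in the entry above relate to each other by simple multiplication: a strategy that realizes a given fraction of the 143% potential gain yields that fraction of 143% as throughput improvement. The short check below reproduces the stated 50.9% figure. The helper name `realized_gain` is hypothetical, and the roughly 31.2% figure for the commonly used approach is derived here under the assumption that the same definition applies; it is not stated in the abstract.

```python
def realized_gain(potential_gain_pct, fraction_realized_pct):
    """Throughput improvement (%) when only part of the potential gain is realized."""
    return potential_gain_pct * fraction_realized_pct / 100.0


perfect_strategy_gain = 143.0  # % over a guardbanded design, from the abstract

proposed_gain = realized_gain(perfect_strategy_gain, 35.6)  # about 50.9 %
common_gain = realized_gain(perfect_strategy_gain, 21.8)    # about 31.2 % (derived)

print(f"proposed strategies:  {proposed_gain:.1f}%")
print(f"common approach:      {common_gain:.1f}%")
```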
Scalable memory architecture for soft-core processors
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753312
T. Jost, G. Nazar, L. Carro
Abstract: Restrictions on memory performance have always had a great impact on soft-core processors. The reduced number of ports on FPGAs' block RAMs may limit the exploitation of parallelism on soft-core processors that are implemented on top of these devices. Multiple memory ports on FPGAs are cumbersome and do not scale well, having a high cost in area and power consumption when implemented. In order to mitigate the impact of the memory bottleneck on such devices, we propose a scalable memory architecture for soft-cores. We make use of software-managed memories to build a memory system capable of improving performance and instruction-level parallelism (ILP) on soft-core processors. Results show that our architecture overcomes the limited parallelism realized on a dual-ported processor, reducing execution time by 16.5%. These improvements come with no area costs, as the processor is kept with the same total memory. Automated code transformations implemented within the LLVM compiler keep changes in application code to a minimum. We also show that our architecture scales better when boosting the number of functional units in the system.
Citations: 2
BADGR: A practical GHR implementation for TAGE branch predictors
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753338
David J. Schlais, Mikko H. Lipasti
Abstract: In this work, we explore global history register (GHR) implementations for Tagged Geometric length (TAGE) style branch predictors with speculative updates. We break down the requirements to both update and recover TAGE predictors' history registers during normal operation and after misspeculation, discussing where various designs exhibit large checkpoint and/or operation overheads. To reduce these inefficiencies, we introduce BADGR, a novel GHR design for TAGE predictors that lowers power consumption and chip area over naive checkpointing techniques by 90% and 85%, respectively.
Citations: 5
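To make concrete what "naive checkpointing" of a speculatively updated GHR could mean, here is a small behavioral sketch of that baseline. It is not BADGR itself, whose design the abstract does not describe. The class name, the monotonically increasing `branch_id`, and the dictionary of full-history checkpoints are assumptions chosen for illustration.

```python
class NaiveCheckpointGHR:
    """Global history register that saves a full copy of itself per branch."""

    def __init__(self, length=8):
        self.mask = (1 << length) - 1  # keep only the newest `length` outcomes
        self.history = 0
        self.checkpoints = {}          # branch_id -> history before that branch

    def speculative_update(self, branch_id, predicted_taken):
        """Checkpoint the whole register, then shift in the predicted outcome."""
        self.checkpoints[branch_id] = self.history
        self.history = ((self.history << 1) | int(predicted_taken)) & self.mask

    def recover(self, branch_id, actual_taken):
        """On a misprediction, restore the checkpoint and apply the real outcome."""
        restored = self.checkpoints[branch_id]
        self.history = ((restored << 1) | int(actual_taken)) & self.mask
        # Younger branches were fetched down the wrong path; drop their state.
        # Assumes branch_id increases in program (fetch) order.
        self.checkpoints = {
            b: h for b, h in self.checkpoints.items() if b <= branch_id
        }

    def retire(self, branch_id):
        """Free the checkpoint once the branch has resolved correctly."""
        self.checkpoints.pop(branch_id, None)


ghr = NaiveCheckpointGHR(length=4)
ghr.speculative_update(0, True)    # history: 0b0001
ghr.speculative_update(1, True)    # history: 0b0011
ghr.speculative_update(2, False)   # history: 0b0110
ghr.recover(1, False)              # branch 1 was actually not taken
```

Each in-flight branch stores a full-width copy of the history, which is where the checkpoint overhead named in the abstract comes from when history lengths grow large.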
Adaptive and flexible key-value stores through soft data partitioning
2016 IEEE 34th International Conference on Computer Design (ICCD) Pub Date : 2016-10-01 DOI: 10.1109/ICCD.2016.7753293
B. Hong, Yongkee Kwon, Jung Ho Ahn, John Kim
Abstract: Key-value stores such as Memcached have become widely used by cloud and web-service providers. While there has been a significant amount of research done on improving the absolute performance of key-value stores, this work proposes an adaptive and flexible approach to key-value stores. We first propose soft data partitioning that divides memory into multiple groups within a single node, or a single server process, to enable scale-up of key-value stores, while providing NUMA locality and an adaptive approach that can reduce overall request miss rate. The soft partitioning enables a flexible Memcached server implementation in a NUMA system through NUMA-aware allocation as well as power-efficient NUMA server operation by migrating frequently accessed key-value pairs among the groups. We also propose an adaptive replacement policy within the Memcached server that compares miss rates across the different memory groups to determine a better replacement policy. To overcome the limitation of partitioning, we propose Group Auto-Balancing (GAB), where memory allocation from the different groups can be borrowed to minimize miss rate. Our results improve Memcached throughput by 12.9%, on average, over the previously proposed MemC3 algorithm (up to 3.1× for write-intensive workloads), while the adaptive replacement policy shows the lowest miss rate on adversarial access patterns.
Citations: 2