2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC): Latest Papers

DWE: Decrypting Learning with Errors with Errors
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196032
S. Bian, Masayuki Hiromoto, Takashi Sato
{"title":"DWE: Decrypting Learning with Errors with Errors","authors":"S. Bian, Masayuki Hiromoto, Takashi Sato","doi":"10.1145/3195970.3196032","DOIUrl":"https://doi.org/10.1145/3195970.3196032","url":null,"abstract":"The Learning with Errors (LWE) problem is a novel foundation of a variety of cryptographic applications, including quantumly-secure public-key encryption, digital signature, and fully homomorphic encryption. In this work, we propose an approximate decryption technique for LWE-based cryptosystems. Based on the fact that the decryption process for such systems is inherently approximate, we apply hardware-based approximate computing techniques. Rigorous experiments have shown that the proposed technique simultaneously achieved 1.3× (resp., 2.5×) speed increase, 2.06× (resp., 7.89×) area reduction, 20.5% (resp., 4×) of power reduction, and an average of 27.1% (resp., 65.6%) ciphertext size reduction for public-key encryption scheme (resp., a state-of-the-art fully homomorphic encryption scheme).","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"25 7 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83946161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
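
The DWE abstract rests on the observation that LWE decryption is inherently approximate: the plaintext bit survives as long as the accumulated error stays below a quarter of the modulus. The following is a minimal sketch of that idea, not the paper's hardware design. It assumes a toy single-bit Regev-style scheme with a binary secret, and the names `Q`, `N`, `NOISE`, and `drop_bits` are hypothetical. Truncating the low bits of each ciphertext coefficient stands in for an approximate arithmetic unit.

```python
import random

Q = 4096    # modulus (toy size, assumption)
N = 16      # secret length (toy size, assumption)
NOISE = 8   # encryption noise is drawn from [-NOISE, NOISE]

def keygen(rng):
    # Binary secret: an assumption that keeps the truncation error small.
    return [rng.randrange(2) for _ in range(N)]

def encrypt(bit, s, rng):
    # Ciphertext (a, b) with b = <a, s> + e + bit * Q/2 (mod Q).
    a = [rng.randrange(Q) for _ in range(N)]
    e = rng.randint(-NOISE, NOISE)
    b = (sum(x * y for x, y in zip(a, s)) + e + bit * (Q // 2)) % Q
    return a, b

def decrypt(ct, s, drop_bits=0):
    # Clearing the low `drop_bits` bits of each a_i mimics approximate hardware.
    # The extra error is at most N * (2**drop_bits - 1), which must stay
    # (together with NOISE) below Q / 4 for decryption to remain correct.
    a, b = ct
    mask = ~((1 << drop_bits) - 1)
    inner = sum((x & mask) * y for x, y in zip(a, s))
    v = (b - inner) % Q
    # Round to the nearer of 0 and Q/2.
    return 1 if Q // 4 <= v < 3 * Q // 4 else 0
```

With these toy parameters, `drop_bits=4` adds at most 16 × 15 = 240 to the error, so the total stays well under Q/4 = 1024 and both exact and truncated decryption recover the bit.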
Dynamic Management of Key States for Reinforcement Learning-assisted Garbage Collection to Reduce Long Tail Latency in SSD
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196034
Won-Kyung Kang, S. Yoo
{"title":"Dynamic Management of Key States for Reinforcement Learning-assisted Garbage Collection to Reduce Long Tail Latency in SSD","authors":"Won-Kyung Kang, S. Yoo","doi":"10.1145/3195970.3196034","DOIUrl":"https://doi.org/10.1145/3195970.3196034","url":null,"abstract":"Garbage collection (GC) is one of main causes of the long-tail latency problem in storage systems. Long-tail latency due to GC is more than 100 times greater than the average latency at the 99th percentile. Therefore, due to such a long tail latency, real-time systems and quality-critical systems cannot meet the system requirements. In this study, we propose a novel key state management technique of reinforcement learning-assisted garbage collection. The purpose of this study is to dynamically manage key states from a significant number of state candidates. Dynamic management enables us to utilize suitable and frequently recurring key states at a small area cost since the full states do not have to be managed. The experimental results show that the proposed technique reduces by 22–25% the long-tail latency compared to a state-of-the-art scheme with real-world workloads.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"76 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76905046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
CMP-PIM: An Energy-Efficient Comparator-based Processing-In-Memory Neural Network Accelerator
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196009
Shaahin Angizi, Zhezhi He, A. S. Rakin, Deliang Fan
{"title":"CMP-PIM: An Energy-Efficient Comparator-based Processing-In-Memory Neural Network Accelerator","authors":"Shaahin Angizi, Zhezhi He, A. S. Rakin, Deliang Fan","doi":"10.1145/3195970.3196009","DOIUrl":"https://doi.org/10.1145/3195970.3196009","url":null,"abstract":"In this paper, an energy-efficient and high-speed comparator-based processing-in-memory accelerator (CMP-PIM) is proposed to efficiently execute a novel hardware-oriented comparator-based deep neural network called CMPNET. Inspired by local binary pattern feature extraction method combined with depthwise separable convolution, we first modify the existing Convolutional Neural Network (CNN) algorithm by replacing the computationally-intensive multiplications in convolution layers with more efficient and less complex comparison and addition. Then, we propose a CMP-PIM that employs parallel computational memory sub-array as a fundamental processing unit based on SOT-MRAM. We compare CMP-PIM accelerator performance on different data-sets with recent CNN accelerator designs. With the close inference accuracy on SVHN data-set, CMP-PIM can get ~ 94× and 3× better energy efficiency compared to CNN and Local Binary CNN (LBCNN), respectively. Besides, it achieves 4.3× speed-up compared to CNN-baseline with identical network configuration.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"27 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86043124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 75
Extensive Evaluation of Programming Models and ISAs Impact on Multicore Soft Error Reliability
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196050
F. Rosa, Vitor V. Bandeira, R. Reis, Luciano Ost
{"title":"Extensive Evaluation of Programming Models and ISAs Impact on Multicore Soft Error Reliability","authors":"F. Rosa, Vitor V. Bandeira, R. Reis, Luciano Ost","doi":"10.1145/3195970.3196050","DOIUrl":"https://doi.org/10.1145/3195970.3196050","url":null,"abstract":"To take advantage of the performance enhancements provided by multicore processors, new instruction set architectures (ISAs) and parallel programming libraries have been investigated across multiple industrial segments. This paper investigates the impact of parallelization libraries and distinct ISAs on the soft error reliability of two multicore ARM processor models (i.e., Cortex-A9 and Cortex-A72), running Linux Kernel and benchmarks with up to 87 billion instructions. An extensive soft error evaluation with more than 1.2 million simulation hours, considering ARMv7 and ARMv8 ISAs and the NAS Parallel Benchmark (NPB) suite is presented.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"75 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88786574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Efficient Batch Statistical Error Estimation for Iterative Multi-level Approximate Logic Synthesis
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196038
Sanbao Su, Yi Wu, Weikang Qian
{"title":"Efficient Batch Statistical Error Estimation for Iterative Multi-level Approximate Logic Synthesis","authors":"Sanbao Su, Yi Wu, Weikang Qian","doi":"10.1145/3195970.3196038","DOIUrl":"https://doi.org/10.1145/3195970.3196038","url":null,"abstract":"Approximate computing is an emerging energy-efficient paradigm for error-resilient applications. Approximate logic synthesis (ALS) is an important field of it. To improve the existing ALS flows, one key issue is to derive a more accurate and efficient batch error estimation technique for all approximate transformations under consideration. In this work, we propose a novel batch error estimation method based on Monte Carlo simulation and local change propagation. It is generally applicable to any statistical error measurement such as error rate and average error magnitude. We applied the technique to an existing state-of-the-art ALS approach and demonstrated its effectiveness in deriving better approximate circuits.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"11 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85232613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
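
The abstract above names Monte Carlo simulation as the basis for estimating statistical error measures such as error rate and average error magnitude. The sketch below shows only that plain Monte Carlo step on a single approximate circuit; it does not implement the paper's batch estimation or local change propagation. The approximate adder (the low `k` bits are ORed instead of added) and all function names are illustrative assumptions.

```python
import random

def approx_add(a, b, k):
    # Illustrative approximate adder: OR the low k bits, add the rest exactly.
    # The carry out of the low part is dropped, which is the error source.
    mask = (1 << k) - 1
    low = (a | b) & mask
    high = ((a >> k) + (b >> k)) << k
    return high | low

def monte_carlo_error(width, k, samples, seed=0):
    # Estimate error rate and average error magnitude over random inputs.
    rng = random.Random(seed)
    wrong = 0
    total_magnitude = 0
    for _ in range(samples):
        a = rng.getrandbits(width)
        b = rng.getrandbits(width)
        diff = abs((a + b) - approx_add(a, b, k))
        wrong += diff != 0
        total_magnitude += diff
    return wrong / samples, total_magnitude / samples
```

For this adder the error equals the bitwise AND of the two low parts, so with uniform inputs the error rate should approach 1 − (3/4)^k; comparing the estimate against that closed form is a quick sanity check on the sampler.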
A Neuromorphic Design Using Chaotic Mott Memristor with Relaxation Oscillation
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3195977
Bonan Yan, Xiong Cao, Hai Li
{"title":"A Neuromorphic Design Using Chaotic Mott Memristor with Relaxation Oscillation","authors":"Bonan Yan, Xiong Cao, Hai Li","doi":"10.1145/3195970.3195977","DOIUrl":"https://doi.org/10.1145/3195970.3195977","url":null,"abstract":"The recent proposed nanoscale Mott memristor features negative differential resistance and chaotic dynamics. This work proposes a novel neuromorphic computing system that utilizes Mott memristors to simplify peripheral circuitry. According to the analytic description of chaotic dynamics and relaxation oscillation, we carefully tune the working point of Mott memristors to balance the chaotic behavior weighing testing accuracy and training efficiency. Compared with conventional designs, the proposed design accelerates the training by 1.893× averagely and saves 27.68% and 43.32% power consumption with 36.67% and 26.75% less area for single-layer and two-layer perceptrons, respectively.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"55 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79836166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Architecture Decomposition in System Synthesis of Heterogeneous Many-Core Systems
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3195995
Valentina Richthammer, T. Schwarzer, S. Wildermann, J. Teich, Michael Glass
{"title":"Architecture Decomposition in System Synthesis of Heterogeneous Many-Core Systems","authors":"Valentina Richthammer, T. Schwarzer, S. Wildermann, J. Teich, Michael Glass","doi":"10.1145/3195970.3195995","DOIUrl":"https://doi.org/10.1145/3195970.3195995","url":null,"abstract":"Determining feasible application mappings for Design Space Exploration (DSE) and run-time embedding is a challenge for modern many-core systems. The underlying NP-complete system-synthesis problem faces tremendously complex problem instances due to the hundreds of heterogeneous processing elements, their communication infrastructure, and the resulting number of mapping possibilities. Thus, we propose to employ a search-space splitting (SSS) technique using architecture decomposition to increase the performance of existing design-time and run-time synthesis approaches. The technique first restricts the search for application embeddings to selected sub-architectures at substantially reduced complexity; therefore, the complete architecture needs to be searched only in case no embedding is found on any sub-system. Furthermore, we introduce a basic learning mechanism to detect promising sub-architectures and subsequently restrict the search to those. We exemplify the SSS for a SAT-based and a problem-specific backtracking-based system synthesis as part of DSE for NoC-based many-core systems. 
Experimental results show drastically reduced execution times (≈ 15–50 × on a 24×24 architecture) and an enhanced quality of the embedding, since less mappings (≈ 20–40 ×, compared to the non-decomposing procedures) need to be discarded due to a timeout.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"19 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79942829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Noise-Aware DVFS Transition Sequence Optimization for Battery-Powered IoT Devices
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196080
Shaoheng Luo, Cheng Zhuo, H. Gan
{"title":"Noise-Aware DVFS Transition Sequence Optimization for Battery-Powered IoT Devices","authors":"Shaoheng Luo, Cheng Zhuo, H. Gan","doi":"10.1145/3195970.3196080","DOIUrl":"https://doi.org/10.1145/3195970.3196080","url":null,"abstract":"Low power system-on-chips (SoCs) are now at the heart of Internet-of-Things (IoT) devices, which are well known for their bursty workloads and limited energy storage — usually in the form of tiny batteries. To ensure battery lifetime, DVFS has become an essential technique in such SoC chips. With continuously decreasing supply level, noise margins in these devices are already being squeezed. During DVFS transition, large current that accompanies the clock speed transition runs into or out of clock networks in a few clock cycles, and induces large Ldi/dt noise, thereby stressing the power delivery network (PDN). Due to the limited area and cost target, adding additional decap to mitigate such noise is usually challenging. A common approach is to gradually introduce/remove the additional clock cycles to increase or reduce the clock frequency in steps, a.k.a., clock skipping. However, such a technique may increase DVFS transition time, and still cannot guarantee minimal noise. In this work, we propose a new noise-aware DVFS sequence optimization technique by formulating a mixed 0/1 programming to resolve the problems of clock skipping sequence optimization. Moreover, the method is also extended to schedule extensive wake-up activities on different clock domains for the same purpose. 
The results show that we are able to achieve minimal-noise sequence within desired transition time with 53% noise reduction and save more than 15–17% power compared with the traditional approach.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"235 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87083034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Area-Optimized Low-Latency Approximate Multipliers for FPGA-based Hardware Accelerators
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3195996
Salim Ullah, Semeen Rehman, B. Prabakaran, F. Kriebel, Muhammad Abdullah Hanif, M. Shafique, Akash Kumar
{"title":"Area-Optimized Low-Latency Approximate Multipliers for FPGA-based Hardware Accelerators","authors":"Salim Ullah, Semeen Rehman, B. Prabakaran, F. Kriebel, Muhammad Abdullah Hanif, M. Shafique, Akash Kumar","doi":"10.1145/3195970.3195996","DOIUrl":"https://doi.org/10.1145/3195970.3195996","url":null,"abstract":"The architectural differences between ASICs and FPGAs limit the effective performance gains achievable by the application of ASIC-based approximation principles for FPGA-based reconfigurable computing systems. This paper presents a novel approximate multiplier architecture customized towards the FPGA-based fabrics, an efficient design methodology, and an open-source library. Our designs provide higher area, latency and energy gains along with better output accuracy than those offered by the state-of-the-art ASIC-based approximate multipliers. Moreover, compared to the multiplier IP offered by the Xilinx Vivado, our proposed design achieves up to 30%, 53%, and 67% gains in terms of area, latency, and energy, respectively, while incurring an insignificant accuracy loss (on average, below 1% average relative error). 
Our library of approximate multipliers is open-source and available online at https://cfaed.tudresden.de/pd-downloads to fuel further research and development in this area, and thereby enabling a new research direction for the FPGA community.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"36 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80863855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 54
Long Live TIME: Improving Lifetime for Training-In-Memory Engines by Structured Gradient Sparsification
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) Pub Date : 2018-06-01 DOI: 10.1145/3195970.3196071
Yi Cai, Yujun Lin, Lixue Xia, Xiaoming Chen, Song Han, Yu Wang, Huazhong Yang
{"title":"Long Live TIME: Improving Lifetime for Training-In-Memory Engines by Structured Gradient Sparsification","authors":"Yi Cai, Yujun Lin, Lixue Xia, Xiaoming Chen, Song Han, Yu Wang, Huazhong Yang","doi":"10.1145/3195970.3196071","DOIUrl":"https://doi.org/10.1145/3195970.3196071","url":null,"abstract":"Deeper and larger Neural Networks (NNs) have made breakthroughs in many fields. While conventional CMOS-based computing platforms are hard to achieve higher energy efficiency. RRAM-based systems provide a promising solution to build efficient Training-In-Memory Engines (TIME). While the endurance of RRAM cells is limited, it’s a severe issue as the weights of NN always need to be updated for thousands to millions of times during training. Gradient sparsification can address this problem by dropping off most of the smaller gradients but introduce unacceptable computation cost. We proposed an effective framework, SGS-ARS, including Structured Gradient Sparsification (SGS) and Aging-aware Row Swapping (ARS) scheme, to guarantee write balance across whole RRAM crossbars and prolong the lifetime of TIME. Our experiments demonstrate that 356× lifetime extension is achieved when TIME is programmed to train ResNet-50 on Imagenet dataset with our SGS-ARS framework.","PeriodicalId":6491,"journal":{"name":"2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)","volume":"32 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88167624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
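
The abstract above describes gradient sparsification as dropping most of the smaller gradients so that fewer RRAM cells are rewritten per training step. The sketch below illustrates one possible structured variant, keeping whole rows with the largest L1 norm; this row-wise choice is an assumption made for illustration and is not claimed to be the paper's SGS scheme, and the aging-aware row swapping is not modeled. The function names are hypothetical.

```python
def row_sparsify(grad, keep_rows):
    # Keep the `keep_rows` rows with the largest L1 norm; zero out the others.
    norms = [sum(abs(g) for g in row) for row in grad]
    order = sorted(range(len(grad)), key=lambda i: norms[i], reverse=True)
    keep = set(order[:keep_rows])
    return [list(row) if i in keep else [0.0] * len(row)
            for i, row in enumerate(grad)]

def count_writes(grad):
    # A cell is written only when its gradient update is non-zero.
    return sum(1 for row in grad for g in row if g != 0)
```

A row-level decision needs one norm per row rather than a sort over every gradient element, which is one reason a structured scheme can be cheaper than element-wise selection.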