2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)最新文献_第8页

Split Compilation for Security of Quantum Circuits 量子电路安全的分离编译

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643478

Abdullah Ash-Saki, A. Suresh, R. Topaloglu, Swaroop Ghosh

{"title":"Split Compilation for Security of Quantum Circuits","authors":"Abdullah Ash-Saki, A. Suresh, R. Topaloglu, Swaroop Ghosh","doi":"10.1109/ICCAD51958.2021.9643478","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643478","url":null,"abstract":"An efficient quantum circuit (program) compiler aims to minimize the gate-count - through efficient instruction translation, routing, gate, and cancellation - to improve run-time and noise. Therefore, a high-efficiency compiler is paramount to enable the game-changing promises of quantum computers. To date, the quantum computing hardware providers are offering a software stack supporting their hardware. However, several third-party software toolchains, including compilers, are emerging. They support hardware from different vendors and potentially offer better efficiency. As the quantum computing ecosystem becomes more popular and practical, it is only prudent to assume that more companies will start offering software-as-a-service for quantum computers, including high-performance compilers. With the emergence of third-party compilers, the security and privacy issues of quantum intellectual properties (IPs) will follow. A quantum circuit can include sensitive information such as critical financial analysis and proprietary algorithms. Therefore, submitting quantum circuits to untrusted compilers creates opportunities for adversaries to steal IPs. In this paper, we present a split compilation methodology to secure IPs from untrusted compilers while taking advantage of their optimizations. In this methodology, a quantum circuit is split into multiple parts that are sent to a single compiler at different times or to multiple compilers. In this way, the adversary has access to partial information. With analysis of over 152 circuits on three IBM hardware architectures, we demonstrate the split compilation methodology can completely secure IPs (when multiple compilers are used) or can introduce factorial time reconstruction complexity while incurring a modest overhead (~ 3% to ~ 6% on average).","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126820779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 15

Binarized SNNs: Efficient and Error-Resilient Spiking Neural Networks through Binarization 二值化snn:基于二值化的高效抗错峰值神经网络

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643463

M. Wei, Mikail Yayla, S. Ho, Jian-Jia Chen, Chia-Lin Yang, H. Amrouch

{"title":"Binarized SNNs: Efficient and Error-Resilient Spiking Neural Networks through Binarization","authors":"M. Wei, Mikail Yayla, S. Ho, Jian-Jia Chen, Chia-Lin Yang, H. Amrouch","doi":"10.1109/ICCAD51958.2021.9643463","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643463","url":null,"abstract":"Spiking Neural Networks (SNNs) are considered the third generation of NNs and can reach similar accuracy as conventional deep NNs, but with a considerable improvement in efficiency. However, to achieve high accuracy, state-of-the-art SNNs employ stochastic spike coding of the inputs, requiring multiple cycles of computation. Because of this and due to the nature of analog computing, it is required to accumulate and hold the charges of multiple cycles, necessitating a large membrane capacitor. This results in high energy, long latency, and expensive area costs, constituting one of the major bottlenecks in analog SNN implementations. Membrane capacitor size determines the precision of the firing time. Hence reducing the capacitor size considerably degrades the inference accuracy. To alleviate this, we focus on bridging the gap between binarized NNs (BNNs) and SNNs. BNNs are rapidly emerging as an attractive alternative for NNs due to their high efficiency and error tolerance. In this work, we evaluate the impact of deploying error-resilient BNNs, i.e. BNNs that have been proactively trained in the presence of errors, on analog implementation of SNNs. We show that for BNNs, the capacitor size and latency can be reduced significantly compared to state-of-the-art SNNs, which employ multi-bit models. Our experiments demonstrate that when error-resilient BNNs are deployed on analog-based SNN accelerator, the size of the membrane capacitor is reduced by 50%, the inference latency is decreased by two orders of magnitude, and energy is reduced by 57% compared to the baseline 4-bit SNN implementation, under minimal accuracy cost.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128199812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6

HASHTAG: Hash Signatures for Online Detection of Fault-Injection Attacks on Deep Neural Networks HASHTAG:用于深度神经网络故障注入攻击在线检测的哈希签名

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643556

Mojan Javaheripi, F. Koushanfar

{"title":"HASHTAG: Hash Signatures for Online Detection of Fault-Injection Attacks on Deep Neural Networks","authors":"Mojan Javaheripi, F. Koushanfar","doi":"10.1109/ICCAD51958.2021.9643556","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643556","url":null,"abstract":"We propose Hashtag, the first framework that enables high-accuracy detection of fault-injection attacks on Deep Neural Networks (DNNs) with provable bounds on detection performance. Recent literature in fault-injection attacks shows the severe DNN accuracy degradation caused by bit flips. In this scenario, the attacker changes a few weight bits during DNN execution by tampering with the program's DRAM memory. To detect runtime bit flips, Hashtag extracts a unique signature from the benign DNN prior to deployment. The signature is later used to validate the integrity of the DNN and verify the inference output on the fly. We propose a novel sensitivity analysis scheme that accurately identifies the most vulnerable DNN layers to the fault-injection attack. The DNN signature is then constructed by encoding the underlying weights in the vulnerable layers using a low-collision hash function. When the DNN is deployed, new hashes are extracted from the target layers during inference and compared against the ground-truth signatures. Hashtag incorporates a lightweight methodology that ensures a low-overhead and real-time fault detection on embedded platforms. Extensive evaluations with the state-of-the-art bit-flip attack on various DNNs demonstrate the competitive advantage of Hashtag in terms of both attack detection and execution overhead.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123862305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 9

DARe: DropLayer-Aware Manycore ReRAM architecture for Training Graph Neural Networks DARe:用于训练图神经网络的DropLayer-Aware多核ReRAM架构

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643511

Aqeeb Iqbal Arka, Biresh Kumar Joardar, J. Doppa, P. Pande, K. Chakrabarty

{"title":"DARe: DropLayer-Aware Manycore ReRAM architecture for Training Graph Neural Networks","authors":"Aqeeb Iqbal Arka, Biresh Kumar Joardar, J. Doppa, P. Pande, K. Chakrabarty","doi":"10.1109/ICCAD51958.2021.9643511","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643511","url":null,"abstract":"Graph Neural Networks (GNNs) are a variant of Deep Neural Networks (DNNs) operating on graphs. GNNs have attributes of both DNNs and graph computation. However, training GNNs on manycore architectures is a challenging task because it involves heavy communication that bottlenecks performance. DropEdge and Dropout, which we collectively refer to as DropLayer, are regularization techniques that can improve the predictive accuracy of GNNs. Moreover, when implemented on a manycore architecture, DropEdge and Dropout are capable of reducing the on-chip traffic. In this paper, we present a ReRAM-based 3D manycore architecture called DARe, tailored for accelerating on-chip training of GNNs. The key component of the DARe architecture is a Network-on-Chip (NoC) that reduces the amount of communication using DropLayer. The reduced traffic prevents communication hotspots and leads to better performance. We demonstrate that DARe outperforms conventional GPUs by up to 6.7X (5.6X on average) in terms of execution time, while being up to 30X (23X on average) more energy efficient for GNN training.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116133461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6

2021 ICCAD CAD Contest Problem C: GPU Accelerated Logic Rewriting 2021 ICCAD CAD竞赛题目C: GPU加速逻辑重写

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643521

G. Pasandi, Sreedhar Pratty, David Brown, Yanqing Zhang, Haoxing Ren, Brucek Khailany

引用次数: 3

Circuit Deobfuscation from Power Side-Channels using Pseudo-Boolean SAT 伪布尔SAT对电源侧信道的去混淆

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643495

Kaveh Shamsi, Yier Jin

{"title":"Circuit Deobfuscation from Power Side-Channels using Pseudo-Boolean SAT","authors":"Kaveh Shamsi, Yier Jin","doi":"10.1109/ICCAD51958.2021.9643495","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643495","url":null,"abstract":"The problem of inferring the value of internal nets in a circuit from its power side-channels has been the topic of extensive research over the past two decades, with several frameworks developed mostly focusing on cryptographic hardware. In this paper, we focus on the problem of breaking logic locking, a technique in which an original circuit is made ambiguous by inserting unknown “key” bits into it, via power side-channels. We present a pair of attack algorithms we term PowerSAT attacks, which take in arbitrary keyed circuits and resolve key information by interacting adaptively with a side-channel “oracle”. They are based on the query-by-disagreement scheme used in functional SAT attacks against locking but utilize Psuedo-Boolean constraints to allow for reasoning about hamming-weight power models. We present a software implementation of the attacks along with techniques for speeding them up. We present simulation and FPGA-based experiments as well. Notably, we demonstrate the extraction of a 32-bit key from a comparator circuit with a $2^{31}$ functional query complexity, in $sim 64$ chosen power side-channel queries using the PowerSAT attack, where traditional CPA fails given 1000 random traces. We release a binary of our implementation along with the FPGA $+mathbf{scope} mathbf{HDL}/mathbf{setup}$ used for the experiments.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122688011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

A High-Performance Accelerator for Super-Resolution Processing on Embedded GPU 嵌入式GPU超分辨率处理的高性能加速器

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643472

W. Zhao, Qi Sun, Yang Bai, Wenbo Li, Haisheng Zheng, Bei Yu, Martin D. F. Wong

引用次数: 1

GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs 在配备hbm的fpga上加速图线性代数

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643582

Yuwei Hu, Yixiao Du, Ecenur Ustun, Zhiru Zhang

{"title":"GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs","authors":"Yuwei Hu, Yixiao Du, Ecenur Ustun, Zhiru Zhang","doi":"10.1109/ICCAD51958.2021.9643582","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643582","url":null,"abstract":"Graph processing is typically memory bound due to low compute to memory access ratio and irregular data access pattern. The emerging high-bandwidth memory (HBM) delivers exceptional bandwidth by providing multiple channels that can service memory requests concurrently, thus bringing the potential to significantly boost the performance of graph processing. This paper proposes GraphLily, a graph linear algebra overlay, to accelerate graph processing on HBM-equipped FPGAs. GraphLily supports a rich set of graph algorithms by adopting the GraphBLAS programming interface, which formulates graph algorithms as sparse linear algebra operations. GraphLily provides efficient, memory-optimized accelerators for the two widely-used kernels in GraphBLAS, namely, sparse-matrix dense-vector multiplication (SpMV) and sparse-matrix sparse-vector multiplication (SpMSpV). The SpMV accelerator uses a sparse matrix storage format tailored to HBM that enables streaming, vectorized accesses to each channel and concurrent accesses to multiple channels. Besides, the SpMV accelerator exploits data reuse in accesses of the dense vector by introducing a scalable on-chip buffer design. The SpMSpV accelerator complements the SpMV accelerator to handle cases where the input vector has a high sparsity. GraphLily further builds a middleware to provide runtime support. With this middleware, we can port existing GraphBLAS programs to FPGAs with slight modifications to the original code intended for CPU/GPU execution. Evaluation shows that compared with state-of-the-art graph processing frameworks on CPUs and GPUs, GraphLily achieves up to 2.5 x and 1.1 x higher throughput, while reducing the energy consumption by 8.1 x and 2.4 x; compared with prior single-purpose graph accelerators on FPGAs, GraphLily achieves 1.2 x -1.9 x higher throughput.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124845755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 36

Traffic-Adaptive Power Reconfiguration for Energy-Efficient and Energy-Proportional Optical Interconnects 高能效和能量比例光互连的流量自适应功率重构

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643475

Yuyang Wang, K. Cheng

引用次数: 0

Simultaneous Transistor Folding and Placement in Standard Cell Layout Synthesis 标准单元布局合成中晶体管的同步折叠和放置

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643537

Kyeonghyeon Baek, Taewhan Kim

{"title":"Simultaneous Transistor Folding and Placement in Standard Cell Layout Synthesis","authors":"Kyeonghyeon Baek, Taewhan Kim","doi":"10.1109/ICCAD51958.2021.9643537","DOIUrl":"https://doi.org/10.1109/ICCAD51958.2021.9643537","url":null,"abstract":"The three major tasks in standard cell layout synthesis are transistor folding, transistor placement, and in-cell routing, which are tightly inter-related, but generally performed one at a time to reduce the extremely high complexity of design space. In this paper, we propose an integrated approach to the two problems of transistor folding and placement. Precisely, we propose a globally optimal algorithm of search tree based design space exploration, devising a set of effective speeding up techniques as well as dynamic programming based fast cost computation. In addition, our algorithm incorporates the minimum OD (oxide diffusion) jog constraint, which closely relies on both of transistor folding and placement. To our knowledge, this is the first work that tries to simultaneously solve the two problems. Through experiments with the transistor netlists and design rules in the ASAP 7nm library, it is shown that our proposed method is able to synthesize fully routable cell layouts of minimal size within 1 second for each netlist, outperforming the cell layout quality in the ASAP 7nm library, which otherwise, may take several hours or days to manually complete layouts of the quality level comparable to ours.","PeriodicalId":370791,"journal":{"name":"2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126996354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1