2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD): Latest Papers, Page 10

UNTANGLE: Unlocking Routing and Logic Obfuscation Using Graph Neural Networks-based Link Prediction

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643476

Lilas Alrahis, Satwik Patnaik, Muhammad Abdullah Hanif, M. Shafique, O. Sinanoglu

Abstract: Logic locking aims to prevent intellectual property (IP) piracy and unauthorized overproduction of integrated circuits (ICs). However, initial logic locking techniques were vulnerable to Boolean satisfiability (SAT)-based attacks. In response, researchers proposed various SAT-resistant locking techniques such as point function-based locking and symmetric interconnection (SAT-hard) obfuscation. We focus on the latter since point function-based locking suffers from various structural vulnerabilities. The SAT-hard logic locking technique, InterLock [1], achieves a unified logic and routing obfuscation that thwarts state-of-the-art attacks on logic locking. In this work, we propose a novel link prediction-based attack, UNTANGLE, that successfully breaks InterLock in an oracle-less setting without having access to an activated IC (oracle). Since InterLock hides selected timing paths in key-controlled routing blocks, UNTANGLE reveals the gates and interconnections hidden in the routing blocks upon formulating this task as a link prediction problem. The intuition behind our approach is that ICs contain a large amount of repetition and reuse cores. Hence, UNTANGLE can infer the hidden timing paths by learning the composition of gates in the observed locked netlist or a circuit library leveraging graph neural networks. We show that circuits withstanding SAT-based and other attacks can be unlocked in seconds with 100% precision using UNTANGLE in an oracle-less setting. UNTANGLE is a generic attack platform (which we also open source [2]) that applies to multiplexer (MUX)-based obfuscation, as demonstrated through our experiments on ISCAS-85 and ITC-99 benchmarks locked using InterLock and random MUX-based locking.

Citations: 16
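
The link-prediction formulation in the UNTANGLE abstract can be illustrated with a toy sketch. This is not the authors' graph neural network: it swaps in a plain frequency count of (driver type, sink type) pairs seen on visible wires, and the netlist, gate names, and helper functions are all hypothetical.

```python
from collections import Counter

def learn_pair_counts(gate_type, edges):
    """Count how often each (driver type, sink type) pair occurs on visible wires."""
    return Counter((gate_type[u], gate_type[v]) for u, v in edges)

def resolve_mux(pair_counts, gate_type, candidates, sink):
    """Pick the candidate driver whose link to `sink` is most common in the counts."""
    return max(candidates, key=lambda d: pair_counts[(gate_type[d], gate_type[sink])])

# Hypothetical toy netlist: gate name -> gate type.
gate_type = {"g1": "NAND", "g2": "NAND", "g3": "XOR",
             "g4": "INV", "g5": "INV", "g6": "NOR", "g7": "INV"}
# Wires that remain visible in the locked netlist, as (driver, sink) pairs.
visible_edges = [("g1", "g4"), ("g2", "g5"), ("g3", "g6")]
pair_counts = learn_pair_counts(gate_type, visible_edges)

# A key-controlled MUX hides whether g7 is driven by g3 (XOR) or g1 (NAND).
guess = resolve_mux(pair_counts, gate_type, ["g3", "g1"], "g7")
print(guess)
```

The frequency model only captures the intuition that repeated structure makes hidden links guessable; a learned link predictor would use far richer neighborhood features.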

Peripheral Circuitry Assisted Mapping Framework for Resistive Logic-In-Memory Computing

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643588

Shuhang Zhang, Hai Helen Li, Ulf Schlichtmann

Abstract: In-memory computing has been applied in different fields due to its superior speed and energy efficiency. Among a variety of memory technologies that have been explored, resistive memory has been widely adopted for various purposes, including Processing-In-Memory (PIM) for neural networks and Logic-In-Memory (LIM) for general logic operations. PIM has been studied intensively in recent years, while progress in developing LIM computing has fallen behind. LIM computing is usually implemented based on MAGIC operations, which require inputs to be aligned regularly along rows or columns in a memory crossbar. As the intermediate data generated during the logic execution are normally scattered across the memory crossbar, alignment operations are inserted to align the data, which often costs numerous cycles and dominates the overall latency. In current MAGIC-based designs, alignment operations induce a significant overhead in either area or latency. Therefore, the Area-Latency-Product (ALP), known as a key metric for circuit performance, still has significant optimization potential in LIM computing. In this work, we leverage peripheral circuitry to conduct alignment operations and propose a novel mapping framework to optimize the latency and area costs. Intermediate data are read out, processed in peripheral circuits, then written back in parallel into target cells of the memory crossbar. The approach eliminates the use of redundant memory cells, leading to area reduction. Moreover, it enables simultaneous alignment of multiple intermediate data, which can decrease the overall latency significantly. Based on simulation results, our proposed mapping framework can achieve an ALP reduction of around 93% on average compared with prior designs, with merely 2.13% total area overhead.

Citations: 2
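
Two ideas from the abstract above lend themselves to a small sketch: the alignment requirement (operands must share a row or a column) and the Area-Latency-Product metric. The cell coordinates, area and cycle counts below are made-up illustration values, not results from the paper, and the function names are hypothetical.

```python
def is_aligned(cells):
    """True if all (row, col) cells share one row or share one column."""
    rows = {r for r, _ in cells}
    cols = {c for _, c in cells}
    return len(rows) == 1 or len(cols) == 1

def area_latency_product(area_cells, latency_cycles):
    """ALP as the plain product of area and latency."""
    return area_cells * latency_cycles

def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` is smaller than `baseline`."""
    return 1.0 - proposed / baseline

# Hypothetical operand placements in a crossbar.
scattered = [(0, 3), (2, 1)]   # different rows and columns: needs alignment
aligned = [(1, 0), (1, 4)]     # same row: usable as operands directly

# Hypothetical designs: (area in cells, latency in cycles).
baseline_alp = area_latency_product(100, 200)
proposed_alp = area_latency_product(80, 50)
reduction = relative_reduction(baseline_alp, proposed_alp)
print(is_aligned(scattered), is_aligned(aligned), reduction)
```

Because ALP is a product, a design can trade a small area increase for a large latency cut and still lower the metric.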

Improving the Robustness of Redundant Execution with Register File Randomization

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643466

I. Tuzov, Pablo Andreu, Laura Medina, Tomás Picornell, A. Robles, P. López, J. Flich, Carles Hernández

Abstract: Staggered Redundant Execution (SRE) is a fault-tolerance mechanism that has been widely deployed in the context of safety-critical applications. SRE not only protects the system in the presence of faults but also helps relax the safety requirements of individual elements. However, in this paper, we show that SRE does not effectively protect the system against a wide range of faults and thus new mechanisms to increase the diversity of homogeneous cores are needed. We propose Register File Randomization (RFR), a low-cost diversity mechanism that significantly increases the robustness of homogeneous multicores against common-cause faults (CCFs) and register file wearout. Our results show that RFR completely removes the failure rate for register file CCFs for certain workloads and reduces the impact of stress-related register file aging by a factor of 5 for the workloads analysed. Our implementation requires fewer than 50 lines of RTL code, and the area (FPGA logic) overhead of RFR is less than 0.2% of a 64-bit RISC-V core FPGA implementation.

Citations: 0
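
The abstract above does not say how RFR remaps registers. As a purely illustrative assumption, the sketch below XORs the architectural register index with a per-core key, so that two redundant cores keep the same architectural register in different physical entries. The key values and the `physical_index` helper are hypothetical, not the authors' design.

```python
NUM_REGS = 32  # number of integer registers in a RISC-V core

def physical_index(arch_index, core_key):
    """Map an architectural register to a physical entry by XOR with a per-core key."""
    if not (0 <= arch_index < NUM_REGS and 0 <= core_key < NUM_REGS):
        raise ValueError("index and key must be 5-bit values")
    return arch_index ^ core_key

# Hypothetical keys for two redundant cores.
key_a, key_b = 0b00000, 0b10110
map_a = [physical_index(r, key_a) for r in range(NUM_REGS)]
map_b = [physical_index(r, key_b) for r in range(NUM_REGS)]

# XOR with a fixed key is a permutation, so no two registers collide.
print(sorted(map_b) == list(range(NUM_REGS)))
# With different keys, every register lands in a different entry on each core.
print(all(a != b for a, b in zip(map_a, map_b)))
```

Under this assumed scheme, a fault tied to one physical entry would affect different architectural registers on each core, which is the kind of diversity the abstract argues for.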

Graph Learning-Based Arithmetic Block Identification

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643581

Zhuolun He, Ziyi Wang, Chen Bai, Haoyu Yang, Bei Yu

Citations: 13

iSTELLAR: intermittent Signature aTtenuation Embedded CRYPTO with Low-Level metAl Routing

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643540

Jeremy Blackstone, D. Das, Alric Althoff, Shreyas Sen, R. Kastner

Citations: 0

Design Space Exploration of Approximation-Based Quadruple Modular Redundancy Circuits

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643561

Marcello Traiola, Jorge Echavarria, A. Bosio, Jürgen Teich, Ian O’Connor

Citations: 2

FedSwap: A Federated Learning based 5G Decentralized Dynamic Spectrum Access System

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643496

Zhihui Gao, Ang Li, Yunfan Gao, Bing Li, Yu Wang, Yiran Chen

Abstract: The era of 5G extends the available spectrum from the microwave band to the millimeter-wave band. The thriving Internet of Things (IoT) also enriches the user equipment (UEs) we use in our daily lives, such as smart glasses, smart watches, and drones. With such a larger spectrum and massive numbers of UEs, existing dynamic spectrum access (DSA) suffers from both low spectrum utilization efficiency and unfair spectrum allocation. Thus, a more sophisticated DSA system is required in the 5G context. In this paper, we propose a federated learning based system, FedSwap, the first decentralized DSA system that improves both efficiency and fairness simultaneously. In FedSwap, we deploy an improved multi-agent reinforcement learning (iMARL) algorithm on each UE, enabling UEs to share the spectrum in a coordinated manner with fewer collisions. Furthermore, we propose a novel swapping mechanism for aggregating UEs' models periodically so that UEs can fairly share the spectrum resources. Meanwhile, the sensory data of UEs are not transmitted and hence privacy is protected. We evaluate FedSwap's performance in 5G simulations with various settings. Compared to the state-of-the-art decentralized DSA methods, FedSwap can significantly improve the efficiency and fairness of spectrum utilization.

Citations: 2
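
The FedSwap abstract mentions periodic model swapping among UEs but gives no details. The sketch below assumes the simplest possible variant, a cyclic rotation of models every `swap_period` rounds, with local data never leaving a UE. The rotation rule, `local_update`, and all parameter names are hypothetical stand-ins, not the paper's algorithm.

```python
def swap_models(models, shift=1):
    """Rotate models among UEs: UE i receives the model held by UE (i - shift)."""
    n = len(models)
    return [models[(i - shift) % n] for i in range(n)]

def run_rounds(models, local_update, num_rounds, swap_period):
    """Alternate local training with a model swap every `swap_period` rounds."""
    for round_no in range(1, num_rounds + 1):
        # Each UE trains on its own data; only models move, never sensory data.
        models = [local_update(ue, m) for ue, m in enumerate(models)]
        if round_no % swap_period == 0:
            models = swap_models(models)
    return models

# Toy "model": the list of UE ids it has been trained on so far.
def record_visit(ue, model):
    return model + [ue]

final = run_rounds([[], [], []], record_visit, num_rounds=3, swap_period=1)
print(final)
```

In this toy run every model ends up having been trained once on each of the three UEs, which hints at why swapping can even out what each UE's policy has seen.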

ScaleDNN: Data Movement Aware DNN Training on Multi-GPU

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643503

Weizheng Xu, Ashutosh Pattnaik, Geng Yuan, Yanzhi Wang, Youtao Zhang, Xulong Tang

Abstract: Training Deep Neural Network (DNN) models is a time-consuming process that requires an immense amount of data and computation. To this end, GPUs are widely adopted to accelerate the training process. However, the delivered training performance rarely scales with the increase in the number of GPUs. The major reason behind this is the large amount of data movement that prevents the system from providing the GPUs with the required data in a timely fashion. In this paper, we propose ScaleDNN, a framework that systematically and comprehensively investigates and optimizes data-parallel training on two types of multi-GPU systems (PCIe-based and NVLink-based). Specifically, ScaleDNN performs: i) CPU-centric input batch splitting, ii) mini-batch data pre-loading, and iii) model parameter compression to effectively a) reduce the data movement between the CPU and multiple GPUs, and b) hide the data movement overheads by overlapping the data transfer with the GPU computation. Our experimental results show that ScaleDNN achieves execution time savings of up to 39.38%, with an average of 17.96%, over modern data parallelism on a PCIe-based multi-GPU system. The corresponding execution time reduction on an NVLink-based multi-GPU system is up to 19.20%, with an average of 10.26%.

Citations: 0
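
Two of the techniques named in the ScaleDNN abstract, splitting an input batch on the CPU side and pre-loading mini-batches so transfer overlaps with compute, can be sketched with the Python standard library alone. This is a generic producer/consumer pattern under assumed semantics, not ScaleDNN's implementation; `split_batch`, `prefetching_loader`, and the `depth` parameter are hypothetical names, and the queue `put` merely stands in for a host-to-device copy.

```python
import queue
import threading

def split_batch(batch, num_gpus):
    """Split one input batch into near-equal per-GPU shards on the CPU side."""
    base, extra = divmod(len(batch), num_gpus)
    shards, start = [], 0
    for gpu in range(num_gpus):
        size = base + (1 if gpu < extra else 0)
        shards.append(batch[start:start + size])
        start += size
    return shards

def prefetching_loader(batches, depth=2):
    """Yield batches while a background thread stages up to `depth` future ones."""
    buffer = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for batch in batches:
            buffer.put(batch)  # stands in for a host-to-device transfer
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item

# Three hypothetical mini-batches of eight samples each, spread over four GPUs.
batches = [list(range(i * 8, (i + 1) * 8)) for i in range(3)]
processed = [split_batch(b, 4) for b in prefetching_loader(batches)]
print(processed[0])
```

The bounded queue is the design choice that matters: it lets loading run ahead of compute by a fixed number of batches without letting memory use grow unchecked.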

Early Validation of SoCs Security Architecture Against Timing Flows Using SystemC-based VPs

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643579

Mehran Goli, R. Drechsler

Citations: 3

SSR: A Skeleton-based Synthesis Flow for Hybrid Processing-in-RRAM Modes

2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) Pub Date : 2021-11-01 DOI: 10.1109/ICCAD51958.2021.9643493

Feng Wang, Guangyu Sun, Guojie Luo

Abstract: Recently, the emerging resistive random access memory (RRAM) has shown its potential to construct a processing-in-memory (PIM) architecture. It supports a variety of computation modes, including the digital mode and the analog mode. Both modes can perform parallel computation inside an RRAM crossbar. However, the lack of an automatic synthesis flow limits their application scenarios. Although previous works implement several large-scale applications, e.g., image processing algorithms and neural networks, using these two modes, most of their implementations are designed manually or semi-manually. In our view, the lack of a specific application representation is a limiting factor for developing a synthesis flow. Therefore, in this work, we propose the skeleton as an application representation. Users can model applications and their potential parallelism in RRAM with nested skeletons and primitive operations. Then, we propose SSR, a skeleton-based flow that can automatically synthesize large-scale applications to RRAM crossbars. For an application represented in skeletons, SSR first partitions it into the digital part and the potential analog part. After that, SSR optimizes primitive operations and allocates bounding boxes to skeletons for both parts under the guidance of pre-synthesis results. Finally, SSR maps bounding boxes of skeletons onto crossbars to enable pipelined computation. Experimental evaluations on several popular applications show that SSR improves throughput, latency, and area multiple times over previous works.

Citations: 2
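
The idea of modeling an application with nested skeletons and primitive operations, as described in the SSR abstract, can be sketched as a small tree data structure. The skeleton kinds chosen here (`Map`, `Reduce`, `Seq`) are common algorithmic skeletons used as an assumption; the abstract does not list SSR's actual skeleton set, and `count_primitives` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Prim:
    op: str                      # a primitive operation, e.g. "mul" or "add"

@dataclass
class Map:
    n: int                       # number of independent, parallel instances
    body: "Skeleton"

@dataclass
class Reduce:
    n: int                       # number of inputs combined by `op`
    op: Prim

@dataclass
class Seq:
    stages: List["Skeleton"]     # stages executed one after another

Skeleton = Union[Prim, Map, Reduce, Seq]

def count_primitives(s):
    """Count primitive-operation instances in a (possibly nested) skeleton."""
    if isinstance(s, Prim):
        return 1
    if isinstance(s, Map):
        return s.n * count_primitives(s.body)
    if isinstance(s, Reduce):
        return max(s.n - 1, 0)   # combining n inputs takes n - 1 operations
    if isinstance(s, Seq):
        return sum(count_primitives(stage) for stage in s.stages)
    raise TypeError(f"unknown skeleton: {s!r}")

# Dot product of length 4: four multiplies, then a 4-input add reduction.
dot4 = Seq([Map(4, Prim("mul")), Reduce(4, Prim("add"))])
# A 3x4 matrix-vector product as three parallel dot products (nested skeleton).
matvec = Map(3, dot4)
print(count_primitives(dot4), count_primitives(matvec))
```

A representation like this makes the parallel structure explicit, which is what a synthesis flow would need in order to partition and place work onto crossbars.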