{"title":"Flexible and Personalized Learning for Wearable Health Applications using HyperDimensional Computing","authors":"Sina Shahhosseini, Yang Ni, Emad Kasaeyan Naeini, M. Imani, A. Rahmani, N. Dutt","doi":"10.1145/3526241.3530373","DOIUrl":"https://doi.org/10.1145/3526241.3530373","url":null,"abstract":"Health and wellness applications increasingly rely on machine learning techniques to learn end-user physiological and behavioral patterns in everyday settings, posing two key challenges: inability to perform on-device online learning for resource-constrained wearables, and learning algorithms that support privacy-preserving personalization. We exploit a Hyperdimensional computing (HDC) solution for wearable devices that offers flexibility, high efficiency, and performance while enabling on-device personalization and privacy protection. We evaluate the efficacy of our approach using three case studies and show that our system improves performance of training by up to 35.8x compared with the state-of-the-art while offering a comparable accuracy.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127940657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adapt-Flow: A Flexible DNN Accelerator Architecture for Heterogeneous Dataflow Implementation","authors":"Jiaqi Yang, Hao Zheng, A. Louri","doi":"10.1145/3526241.3530311","DOIUrl":"https://doi.org/10.1145/3526241.3530311","url":null,"abstract":"Deep neural networks (DNNs) have been widely applied to various application domains. DNN computation is memory and compute-intensive requiring excessive memory access and a large number of computations. To efficiently implement these applications, several data reuse and parallelism exploitation strategies, called dataflows, have been proposed. Studies have shown that many DNN applications benefit from a heterogeneous dataflow strategy where the dataflow type changes from layer to layer. Unfortunately, very few existing DNN architectures can simultaneously accommodate multiple dataflows due to their limited hardware flexibility. In this paper, we propose a flexible DNN accelerator architecture, called Adapt-Flow, which has the capability of supporting multiple dataflow selections for each DNN layer at runtime. Specifically, the proposed Adapt-Flow architecture consists of (1) a flexible interconnect, (2) a dataflow selection algorithm, and (3) a dataflow mapping technique. The flexible interconnect provides dynamic support for various traffic patterns required by different dataflows. The proposed dataflow selection algorithm selects the optimal dataflow strategy for a given DNN layer with the aim of much improved performance. And the dataflow mapping technique efficiently maps the dataflow amenable to the flexible interconnect. Simulation studies show that the proposed Adapt-Flow architecture reduces execution time by 46%, 78%, 26%, and energy consumption by 45%, 80%, 25% as compared to NVDLA, ShiDianNao, and Eyeriss respectively.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129294328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Information Security Courses With a Remotely Accessible Side-Channel Analysis Setup","authors":"Abubakr Abdulgadir, J. Kaps, A. Salman","doi":"10.1145/3526241.3530347","DOIUrl":"https://doi.org/10.1145/3526241.3530347","url":null,"abstract":"The ever-increasing security threats to our digital infrastructure impose the training of a sufficient number of engineers on real-world equipment and attacks. A significant investment in equipment is often needed to teach hardware security. Additionally, the global COVID-19 pandemic demonstrated that online-accessible educational systems are crucial to the continuity of the teaching process. In this work, we describe our experiment with teaching hardware security using a centralized shared setup that can be accessed remotely by students. Our setup reduces the cost and makes teaching such advanced topics more accessible while keeping the benefits of using real hardware to gain practical experience.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117251012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ENTANGLE: An Enhanced Logic-locking Technique for Thwarting SAT and Structural Attacks","authors":"A. Darjani, N. Kavand, Shubham Rai, M. Wijtvliet, Akash Kumar","doi":"10.1145/3526241.3530371","DOIUrl":"https://doi.org/10.1145/3526241.3530371","url":null,"abstract":"Among the SAT-resilient logic locking techniques, the Stripped-Functionality-Logic-Locking (SFLL) is the most promising solution which can guard the intellectual property against approximate, sensitization, SAT, and structural attacks which target Point-function techniques. However, even the SFLL technique has been shown to be vulnerable to a recent class of structural attacks that identify the perturbation logic. In this paper, we first categorize all possible classes of attacks on SFLL. Then we propose ENTANGLE a novel logic locking technique built upon SFLL that can resist all of these attacks, including the emerging ML-Based attacks. We test our technique against publicly available SFLL attacks. The implementation results show that ENTANGLE can secure large-sized industrial circuits with an average overhead of 11.6 percent and 9.1 percent for area and power, respectively.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115147435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MOCCA: A Process Variation Tolerant Systolic DNN Accelerator using CNFETs in Monolithic 3D","authors":"Samuel J. Engers, Cheng Chu, Dawen Xu, Ying Wang, Fan Chen","doi":"10.1145/3526241.3530380","DOIUrl":"https://doi.org/10.1145/3526241.3530380","url":null,"abstract":"Hardware accelerators based on systolic arrays have become the dominant method for efficient processing of deep neural networks (DNNs). Although such designs provide significant performance improvement compared to its contemporary CPUs or GPUs, their power efficiency and area efficiency are greatly limited by the large computing array and on-chip memory. In this work, we demonstrate that we can further improve the efficiency of systolic accelerators using emerging carbon nanotube field-effect transistors (CNFETs) by stacking the computing logic and on-chip memory on multiple layers and utilizing monolithic 3D (M3D) vias for low-latency communication. We comprehensively explore the design space and present MOCCA, the first process variation tolerable CNFET-based systolic DNN accelerator. We validate MOCCA against previous 2D accelerators on state-of-the-arts DNN models. On average, MOCCA achieves the same throughput with 6.12× and 2.12× improvement respectively on performance and power efficiency in a 2× reduced chip footprint.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114376487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Memristor-based Secure Scan Design against the Scan-based Side-Channel Attacks","authors":"Mengqiang Lu, Aijiao Cui, Yan Shao, G. Qu","doi":"10.1145/3526241.3530345","DOIUrl":"https://doi.org/10.1145/3526241.3530345","url":null,"abstract":"Scan chain design can improve the testability of a circuit while it can be used as a side-channel to access the sensitive information inside a cryptographic chip for the crack of cipher key. To secure the scan design while maintaining its testability, this paper proposes a memristor-based secure scan design. A lock and key scheme is introduced. Physical unclonable function (PUF) is used to generate a unique test key for each chip. When an input test key matches the PUF-based key, the scan chain can be used normally for testing. Otherwise, the data in some scan cells are obfuscated by the random bits, which are generated by reading the status of a memristor. As the random bits do not relate to the original test data, an adversary cannot access useful information from scan chain to deduce the cipher key. The experimental results show that the proposed secure scan design can resist all existing attacks while incurring low overhead. Also, the testability of the original design is not affected.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127066474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"iMAD: An In-Memory Accelerator for AdderNet with Efficient 8-bit Addition and Subtraction Operations","authors":"Shien Zhu, Shiqing Li, Weichen Liu","doi":"10.1145/3526241.3530313","DOIUrl":"https://doi.org/10.1145/3526241.3530313","url":null,"abstract":"Adder Neural Network (AdderNet) is a new type of Convolutional Neural Networks (CNNs) that replaces the computational-intensive multiplications in convolution layers with lightweight additions and subtractions. As a result, AdderNet preserves high accuracy with adder convolution kernels and achieves high speed and power efficiency. In-Memory Computing (IMC) is known as the next-generation artificial-intelligence computing paradigm that has been widely adopted for accelerating binary and ternary CNNs. As AdderNet has much higher accuracy than binary and ternary CNNs, accelerating AdderNet using IMC can obtain both performance and accuracy benefits. However, existing IMC devices have no dedicated subtraction function, and adding subtraction logic may bring larger area, higher power, and degraded addition performance. In this paper, we propose iMAD as an in-memory accelerator for AdderNet with efficient addition and subtraction operations. First, we propose an efficient in-memory subtraction operator at the circuit level and co-optimize the addition performance to reduce the latency and power. Second, we propose an accelerator architecture for AdderNet with high parallelism based on the optimized operators. Third, we propose an IMC-friendly computation pipeline for AdderNet convolution at the algorithm level to further boost the performance. Evaluation results show that our accelerator iMAD achieves 3.25X speedup and 3.55X energy efficiency compared with a state-of-the-art in-memory accelerator.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122032882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Method for Timing-based Information Flow Verification in Hardware Designs","authors":"Khitam M. Alatoun, R. Vemuri","doi":"10.1145/3526241.3530363","DOIUrl":"https://doi.org/10.1145/3526241.3530363","url":null,"abstract":"Timing side channels are a serious threat to the security of hardware designs. By analyzing the execution times of a design, the attacker can expose the secret information. This paper proposes an approach to verify and monitor timing-based information flow properties. In addition, the method can highlight the path that is vulnerable to leakage, making it easier to trace the leaking channel. The method can be used during formal verification, dynamic verification during simulation, post-fabrication validation, and run-time monitoring if one is necessary. The method reduces the overhead of the security model, which helps speed up the verification process and create an efficient run-time hardware monitor. Various timing-based information flow properties from five different hardware designs were verified. The results show that our approach can accurately detect hardware timing channels with lower overhead.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125812606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Highly Robust, Low Delay and DNU-Recovery Latch Design for Nanoscale CMOS Technology","authors":"Aibin Yan, Zhen Zhou, Shaojie Wei, Jie Cui, Yong Zhou, Tianming Ni, P. Girard, X. Wen","doi":"10.1145/3526241.3530321","DOIUrl":"https://doi.org/10.1145/3526241.3530321","url":null,"abstract":"With the advancement of semiconductor technologies, nano-scale CMOS circuits have become more vulnerable to soft errors, such as single-node-upsets (SNUs) and double-node-upsets (DNUs). In order to effectively tolerate DNUs caused by radiation and reduce the delay and area consumption of latches, this paper proposes a DNU resilient latch in the nanoscale CMOS technology. The latch mainly comprises four input-split inverters and four 2-input C-elements. Since all internal nodes are interlocked, the latch can recover from all possible DNUs. Simulation results show that, compared with the state-of-the-art DNU self-recovery latch designs, the proposed latch can save 64.51% transmission delay and 56.88% delay-area-power-product (DAPP) on average, respectively.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122526501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 6A: Special Session -1: Machine Learning and Hardware Attacks","authors":"Qiaoyan Yu","doi":"10.1145/3542692","DOIUrl":"https://doi.org/10.1145/3542692","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121525500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}