{"title":"LeHDC","authors":"Shijin Duan, Yejia Liu, Shaolei Ren, Xiaolin Xu","doi":"10.1145/3489517.3530593","DOIUrl":"https://doi.org/10.1145/3489517.3530593","url":null,"abstract":"Thanks to the tiny storage and efficient execution, hyperdimensional Computing (HDC) is emerging as a lightweight learning framework on resource-constrained hardware. Nonetheless, the existing HDC training relies on various heuristic methods, significantly limiting their inference accuracy. In this paper, we propose a new HDC framework, called LeHDC, which leverages a principled learning approach to improve the model accuracy. Concretely, LeHDC maps the existing HDC framework into an equivalent Binary Neural Network architecture, and employs a corresponding training strategy to minimize the training loss. Experimental validation shows that LeHDC outperforms previous HDC training strategies and can improve on average the inference accuracy over 15% compared to the baseline HDC.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128032000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DELTA","authors":"Nishant Gupta, M. S. Desai, M. Wijtvliet, Shubham Rai, Akash Kumar","doi":"10.1145/3489517.3530666","DOIUrl":"https://doi.org/10.1145/3489517.3530666","url":null,"abstract":"This paper presents a stealthy triggering mechanism that reduces the dependencies of analog hardware Trojans on the frequent toggling of the software-controlled rare nets. The trigger to activate the Trojan is generated by using a glitch generation circuit and a clock signal, which increases the selectivity and feasibility of the trigger signal. The proposed trigger is able to evade the state-of-the-art run-time detection (R2D2) and Built-In Acceleration Structure (BIAS) schemes. Furthermore, the simulation results show that the proposed trigger circuit incurs a minimal overhead in side-channel footprints in terms of area (29 transistors), delay (less than 1ps in the clock cycle), and power (1μW).","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130003381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bipolar vector classifier for fault-tolerant deep neural networks","authors":"Suyong Lee, Insu Choi, Joon-Sung Yang","doi":"10.1145/3489517.3530498","DOIUrl":"https://doi.org/10.1145/3489517.3530498","url":null,"abstract":"Deep Neural Networks (DNNs) surpass the human-level performance on specific tasks. The outperforming capability accelerate an adoption of DNNs to safety-critical applications such as autonomous vehicles and medical diagnosis. Millions of parameters in DNN requires a high memory capacity. A process technology scaling allows increasing memory density, however, the memory reliability confronts significant reliability issues causing errors in the memory. This can make stored weights in memory erroneous. Studies show that the erroneous weights can cause a significant accuracy loss. This motivates research on fault-tolerant DNN architectures. Despite of these efforts, DNNs are still vulnerable to errors, especially error in DNN classifier. In the worst case, because a classifier in convolutional neural network (CNN) is the last stage determining an input class, a single error in the classifier can cause a significant accuracy drop. To enhance the fault tolerance in CNN, this paper proposes a novel bipolar vector classifier which can be easily integrated with any CNN structures and can be incorporated with other fault tolerance approaches. Experimental results show that the proposed method stably maintains an accuracy with a high bit error rate up to 10^-3 in the classifier.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"510 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124467887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"R2B","authors":"Diansen Sun, Y. Chai, Chao Liu, Wei-Zen Sun, Qingpeng Zhang","doi":"10.1145/3489517.3530521","DOIUrl":"https://doi.org/10.1145/3489517.3530521","url":null,"abstract":"Big data applications have differentiated requirements for I/O resources in cloud environments. For instance, data analytic and AI/ML applications usually have periodical burst I/O traffic, and data stream processing and database applications often introduce fluctuating I/O loads based on a guaranteed I/O bandwidth. However, the existing resource isolation model (i.e., RLW) and methods (e.g., Token-bucket, mClock, and cgroup) cannot support the fluctuating I/O load and differentiated I/O demands well, and thus cannot achieve fairness, high resource utilization, and high performance for applications at the same time. In this paper, we propose a novel efficient and fair I/O resource isolation model and method called R2B, which can adapt to the differentiated I/O characteristics and requirements of different applications in a shared resource environment. R2B can simultaneously satisfy the fairness and achieve both high application efficiency and high bandwidth utilization. This work aims to help the cloud provider achieve higher utilization by shifting the burden to the cloud customers to specify their type of workload.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121152175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TAIM","authors":"Nameun Kang, Hyungjun Kim, Hyunmyung Oh, Jae-Joon Kim","doi":"10.1145/3489517.3530574","DOIUrl":"https://doi.org/10.1145/3489517.3530574","url":null,"abstract":"Recently, various in-memory computing accelerators for low precision neural networks have been proposed. While in-memory Binary Neural Network (BNN) accelerators achieved significant energy efficiency, BNNs show severe accuracy degradation compared to their full precision counterpart models. To mitigate the problem, we propose TAIM, an in-memory computing hardware that can support ternary activation with negligible hardware overhead. In TAIM, a 6T SRAM cell can compute the multiplication between ternary activation and binary weight. Since the 6T SRAM cell consumes no energy when the input activation is 0, the proposed TAIM hardware can achieve even higher energy efficiency compared to BNN case by exploiting input 0's. We fabricated the proposed TAIM hardware in 28nm CMOS process and evaluated the energy efficiency on various image classification benchmarks. The experimental results show that the proposed TAIM hardware can achieve ~ 3.61× higher energy efficiency on average compared to previous designs which support ternary activation.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124121481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing critical systems with iterative automated safety analysis","authors":"Ran Wei, Zhe Jiang, Xiaoran Guo, Haitao Mei, Athanasios Zolotas, T. Kelly","doi":"10.1145/3489517.3530434","DOIUrl":"https://doi.org/10.1145/3489517.3530434","url":null,"abstract":"Safety analysis is an important aspect in Safety-Critical Systems Engineering (SCSE) to discover design problems that can potentially lead to hazards and eventually, accidents. Performing safety analysis requires significant manual effort --- its automation has become the research focus in the critical system domain due to the increasing complexity of systems and emergence of open adaptive systems. In this paper, we present a methodology, in which automated safety analysis drives the design of safety-critical systems. We discuss our approach with its tool support and evaluate its applicability. We briefly discuss how our approach fits into current practice of SCSE.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115562115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SMART: on simultaneously marching racetracks to improve the performance of racetrack-based main memory","authors":"Xiangjun Peng, Ming-Chang Yang, Ho Ming Tsui, Chi Ngai Leung, Wang Kang","doi":"10.1145/3489517.3530538","DOIUrl":"https://doi.org/10.1145/3489517.3530538","url":null,"abstract":"RaceTrack Memory (RTM) is a promising media for modern Main Memory subsystems. However, the \"shift-before-access\" principle, as the nature of RTM, introduces considerable overheads to the access latency. To obtain more insights for the mitigation of shift overheads, this work characterizes and observes that the access patterns, exhibited by the state-of-the-art RTM-based Main Memory, mismatches with the granularity of shift commands (i.e., a group of RaceTracks called Domain Block Cluster (DBC)). Based on the characterization, we propose a novel mechanism called SMART, which simultaneously and proactively marches all DBCs within a subarray, so that subsequent accesses to other DBCs can be served without additional shift commands. Evaluation results show that, averaged across 15 real-world workloads, SMART significantly outperforms other state-of-the-art proposals of RTM-based Main Memory by at least 1.53X in terms of the total execution time, on two different generations of RTM technologies.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127499542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAPredictor","authors":"Yujuan Tan, Wei Chen, Zhulin Ma, Dan Xiao, Zhichao Yan, Duo Liu, Xianzhang Chen","doi":"10.1145/3489517.3530539","DOIUrl":"https://doi.org/10.1145/3489517.3530539","url":null,"abstract":"In a hybrid memory system using DRAM as the NVM cache, DRAM and NVM can be accessed in serial or parallel mode. However, we found that using either mode alone will bring access latency and bandwidth problems. In this paper, we integrate these two access modes and design a simple but accurate predictor (called SAPredictor) to help choose the appropriate access mode, thereby avoiding long access latency and bandwidth problems to improve memory performance. Our experiments show that SAPredictor achieves an accuracy rate of up to 97.1% and helps reduce access latency by up to 35.6% at fairly low costs.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114293927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AVATAR: an aging- and variation-aware dynamic timing analyzer for application-based DVAFS","authors":"Zuodong Zhang, Zizheng Guo, Yibo Lin, Runsheng Wang, Ru Huang","doi":"10.1145/3489517.3530530","DOIUrl":"https://doi.org/10.1145/3489517.3530530","url":null,"abstract":"As the timing guardband continues to increase with the continuous technology scaling, better-than-worst-case (BTWC) design has gained more and more attention. BTWC design can improve energy efficiency and/or performance by relaxing the conservative static timing constraints and exploiting the dynamic timing margin. However, to avoid potential reliability hazards, the existing dynamic timing analysis (DTA) tools have to add extra aging and variation guardbands, which are estimated under the worst-case corners of aging and variation. Such guardbanding method introduces unnecessary margin in timing analysis, thus reducing the performance and efficiency gains of BTWC designs. Therefore, in this paper, we propose AVATAR, an aging- and variation-aware dynamic timing analyzer that can perform DTA with the impact of transistor aging and random process variation. We also propose an application-based dynamic-voltage-accuracy-frequency-scaling (DVAFS) design flow based on AVATAR, which can improve energy efficiency by exploiting both dynamic timing slack (DTS) and the intrinsic error tolerance of the application. The results show that a 45.8% performance improvement and 68% power savings can be achieved by exploiting the intrinsic error tolerance. Compared with the conventional flow based on the corner-based DTA, the additional performance improvement of the proposed flow can be up to 14% or the additional power-saving can be up to 20%.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121904029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}