{"title":"AutoML for Multilayer Perceptron and FPGA Co-design","authors":"Philip Colangelo, Oren Segal, Alexander Speicher, M. Margala","doi":"10.1109/socc49529.2020.9524785","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524785","url":null,"abstract":"Optimizing neural network architectures (NNA) is a difficult process in part because of the vast number of hyperparameter combinations that exist. The difficulty in designing performant neural networks has brought a recent surge in interest in the automatic design and optimization of neural networks. The focus of the existing body of research has been on optimizing NNA for accuracy [1] [2] with publications starting to address hardware optimizations [3]. Our focus is to close this gap by using evolutionary algorithms to search an entire design space, including NNA and reconfigurable hardware. Large data-centric companies such as Facebook [4] [5] and Google [6] have published data showing that MLP workloads are the majority of their application base. Facebook cites the use of MLP for tasks such as determining which ads to display, which stories matter to see in a news feed, and which results to present from a search. Park et al. [7] stress the importance of these networks, the current limitations of standard hardware, and the need for what this research aims to provide, i.e., software and hardware co-design. Our research aims to take advantage of the reconfigurable architecture of an FPGA device that is capable of molding to a specific workload and neural network structure. Leveraging evolutionary algorithms to search the entire design space of both MLP and target hardware simultaneously, we find unique solutions that achieve both top accuracy and optimal hardware performance.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"06 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129984640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASIC Power Estimation Across Revisions using Machine Learning","authors":"Ali Tariq, Howard Yang","doi":"10.1109/socc49529.2020.9524795","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524795","url":null,"abstract":"ASIC chip revisions often include major changes, such as new features, timing updates, and bug fixes. It is important to be able to accurately estimate dynamic and leakage power for these changes, during the architectural planning stage. Using physical design data from prior revisions, we can train machine learning models that can predict standard cell power within 15% to 40% of the post-route implementation for the new ASIC. We also look at multiple different machine learning frameworks to find the optimal solution for this problem.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126580623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cycle-to-cycle Variation Enabled Energy Efficient Privacy Preserving Technology in ANN","authors":"Jingyan Fu, Zhiheng Liao, Jinhui Wang","doi":"10.1109/socc49529.2020.9524794","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524794","url":null,"abstract":"Differential privacy is emerging as an effective solution to achieve privacy protection for the Artificial Intelligence neural network (ANN). However, not only matrix calculations of a neural network but also random noise injection mechanisms for differential privacy consume large power and resources. Traditionally, most privacy protection technologies are software technologies using von Neumann architecture and hardware with extra noise generation circuit unit. In this paper, a memristor based crossbar in-memory computing system is proposed to enable energy efficient privacy preserving technology in ANN. We utilize inherent cycle-to-cycle variations of memristors and apply the proposed variation-based pulse pair method during the weight update process. As a result, the proposed methods realize a machine learning system with privacy protection and show up to 29.24% recognition accuracy improvement with various privacy budget ε.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133548624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analog Content Addressable Memory using Ferroelectric: A Case Study of Search-in-Memory","authors":"Chuangtao Chen, Qingrong Huang, Chao Li, Li Zhang, Cheng Zhuo, Xunzhao Yin","doi":"10.1109/socc49529.2020.9524766","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524766","url":null,"abstract":"Non-volatile (NV) devices are actively considered for compact and high performance memory architectures, especially in-memory computing (IMC) designs where processing and memory elements are co-located to address the memory wall issues for data-intensive applications. Content addressable memories (CAMs) are a form of IMC that compares the input query data against the stored data in parallel, and outputs the comparison result in terms of match or mismatch. Numerous CAMs have been proposed based on NV devices and demonstrate superior area, energy and performance metrics over the CMOS based conventional ones. Unlike the prior works that exploit the NV devices in the digital domain, in this paper, we propose an analog CAM design, which utilizes the analog characteristics of the ferroelectric field-effect transistor (FeFET) to achieve denser storage and search operations in the analog domain. We illustrate our proposed analog CAM through a device-circuit co-design approach, and validate the 3-bit storage and search capability of the proposed design. The scalability of the proposed design is also examined. Evaluation results suggest that our analog CAM can achieve 22.4× higher memory density and 8.6× higher energy efficiency compared with the conventional CMOS based design.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131319777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Downlink-Centric User Scheduling for Full-Duplex MU-MIMO Systems","authors":"Jianhua Zhang, M. Zou, Lai Wei, Meng Ma, B. Jiao","doi":"10.1109/socc49529.2020.9524789","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524789","url":null,"abstract":"We consider a full-duplex (FD) multiuser multiple-input multiple-output (MU-MIMO) system, where an FD base station (BS) with multiple antennas serves multiple half-duplex (HD) user equipments (UEs) in both uplink (UL) and downlink (DL) via the same time-frequency resources. UE scheduling is in demand to manage the UL-to-DL interference (UDI) incurred by the FD operation. Existing scheduling algorithms require UDI channel state information (CSI) between each pair of the candidate UEs, which incurs a significant amount of overhead as the number of UEs grows. To reduce the overhead, we utilize channel reciprocity and UL-DL duality to design a DL-centric scheduling scheme. By selecting UEs only based on their DL channels and received UDI strength, the proposed scheme no longer requires the massive UDI CSI. Numerical results demonstrate that the proposed algorithm can achieve near-optimal performance without knowing any UDI CSI.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116880789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cube Attack on a Trojan-Compromised Hardware Implementation of Ascon","authors":"Basel Halak, Jorge Duarte-Sanchez","doi":"10.1109/socc49529.2020.9524771","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524771","url":null,"abstract":"The Ascon algorithm was selected in 2019 in the CAESAR competition as the first option for lightweight applications, as an alternative to AES-GCM for authenticated encryption. As with other encryption algorithms, Ascon relies on some parameters and security assumptions to guarantee its security. For example, if the number of rounds of the initialization phase of the encryption is reduced, the key can be obtained using a cube attack. In this work we describe how, by inserting a low-overhead hardware trojan in a hardware implementation of Ascon, it is possible to reduce the number of rounds of its initialization stage and perform a cube attack in order to obtain the key in 94 seconds on average.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126047609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The IANET Hardware Accelerator for Audio and Visual Data Classification","authors":"Rohini J. Gillela, A. Ganguly, D. Patru, Mark A. Indovina","doi":"10.1109/socc49529.2020.9524782","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524782","url":null,"abstract":"There are several instances during driving where audible data is of importance, though often ignored. Today's deaf or acoustically impaired drivers face challenges during driving in various countries. They are vulnerable as they can't hear the siren or vehicle horn and depend on other drivers around them to act. Processing audio and providing feedback would be equally valuable to any driver or autonomous vehicle. This paper addresses the gap in existing technology by integrating audio or acoustic and image or visual processing units with the help of efficient hardware design and architecture of the Convolutional Neural Networks (CNNs). These processing units are integrated into a single module, IANET, that makes use of two CNN accelerators, one for audio and the other for image processing units. The hardware is implemented in various fixed-point representations to observe the accuracy and stability of network classifiers at each representation. The hardware accelerators for image and audio classification achieve a throughput of 30 frames per second (fps) at 180 MHz and 1 fps at 20 MHz, respectively. This paper presents the power and area-efficient hardware implementation of IANET.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"205 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122779837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dynamic Expansion Order Algorithm for the SAT-based Minimization","authors":"Chia-Chun Lin, Kit Seng Tam, Chana-Cheng Ko, Hsin-Ping Yen, Shenz-Hsiu Wei, Yung-Chih Chen, Chun-Yao Wang","doi":"10.1109/socc49529.2020.9524758","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524758","url":null,"abstract":"Logic minimization attracted much attention in the early days because it is the engine for logic synthesis and optimization. Recently, a previous work proposed a SAT-based minimization algorithm for the patch function in the Engineering Change Order (ECO) problem. However, the algorithm is time-consuming for the functions in high dimension Boolean space. Therefore, in this paper, we propose an efficient algorithm that is suitable for the functions in high dimension Boolean space.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122109606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings 33rd IEEE International System on Chip Conference (SOCC) [Front matter]","authors":"","doi":"10.1109/socc49529.2020.9524723","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524723","url":null,"abstract":"","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132502410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Active Voltage Regulation for Heterogeneous TSV 3D-ICs","authors":"Po-Tsang Huang, Tzung-Han Tsai, Po-Jen Yang, W. Hwang, Hung-Ming Chen","doi":"10.1109/socc49529.2020.9524797","DOIUrl":"https://doi.org/10.1109/socc49529.2020.9524797","url":null,"abstract":"Among different system-in-package (SiP) technologies, through-silicon-via (TSV) 3D-IC is the key to the success of future heterogeneous SiP integration due to the high interconnect density. In heterogeneous TSV 3D integration, however, the increasing current density through both package and TSVs could potentially lead to large simultaneous switching noise (SSN). In this paper, a 3D power network with hierarchical active voltage regulation is proposed to reduce dynamic noises for heterogeneous TSV 3D-ICs. For the hierarchical active voltage regulation, the global power network and the local power networks are decoupled by fully-integrated voltage regulators (FIVRs). Furthermore, active switched decoupling capacitors (DECAPs) and distributed FIVRs are adopted as the global regulator and local regulators, respectively. Additionally, a substrate noise suppression technique is also presented to enhance the power integrity by reducing both substrate and TSV coupling noises. These techniques not only reduce the required DECAPs but also provide flexible power sources. The modeling and simulation results of a heterogeneous TSV 3D integration demonstrate that the noise on power supply pairs (VDD & GND) is suppressed by up to 71.10% with only 1.11% power overhead.","PeriodicalId":114740,"journal":{"name":"2020 IEEE 33rd International System-on-Chip Conference (SOCC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125927021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}