{"title":"Large Forests and Where to “Partially” Fit Them","authors":"Andrea Damiani, Emanuele Del Sozzo, M. Santambrogio","doi":"10.1109/ASP-DAC52403.2022.9712534","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712534","url":null,"abstract":"The Artificial Intelligence of Things (AIoT) calls for on-site Machine Learning inference to overcome the instability in latency and availability of networks. Thus, hardware acceleration is paramount for reaching the Cloud's modeling performance within an embedded device's resources. In this paper, we propose Entree, the first automatic design flow for deploying the inference of Decision Tree (DT) ensembles over Field-Programmable Gate Arrays (FPGAs) at the network's edge. It exploits dynamic partial reconfiguration on modern FPGA-enabled Systems-on-a-Chip (SoCs) to accelerate arbitrarily large DT ensembles at a latency a hundred times stabler than software alternatives. Plus, given Entree's suitability for both hardware designers and non-hardware-savvy developers, we believe it has the potential of helping data scientists to develop a non-Cloud-centric AIoT.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"101 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114011586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Quality of Hardware Accelerators through automatic Behavioral Input Language Conversion in HLS","authors":"M. I. Rashid, Benjamin Carrión Schäfer","doi":"10.1109/asp-dac52403.2022.9712582","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712582","url":null,"abstract":"High-Level Synthesis (HLS) is now part of most standard VLSI design flows and there are numerous commercial HLS tools available. One persistent problem of HLS is that the quality of results (QoR) still heavily depends on minor things like how the code is written. One additional observation that we have made in this work is that the input language used for the same HLS tool affects the QoR. HLS tools (commercial and academic) are built in a modular way which typically include a separate front-end (parser) for each input language supported. These front-ends parse the untimed behavioral descriptions, perform numerous technology independent optimizations and output a common intermediate representations (IR) for all different input languages supported. These optimizations also heavily depend on the synthesis directives set by the designer. These directives in the form of pragmas allow to control how to synthesize arrays (register or RAM), loops (unroll or not or pipeline) and functions (inline or not). We have observed that two functional equivalent behavioral descriptions with the same set of synthesis directives often lead to circuits with different QoR for the same HLS tool. Thus, automated approaches are needed to help designers to generate the best possible circuit independently of the input language used. To address this, in this work we propose using Graph Convolutional Networks (GCN) to determine the best language for a given new behavioral description and present an automated language converter for HLS.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125106920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Graph Neural Network Method for Fast ECO Leakage Power Optimization","authors":"Kai Wang, Peng Cao","doi":"10.1109/ASP-DAC52403.2022.9712486","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712486","url":null,"abstract":"In modern design, engineering change order (ECO) is often utilized to perform power optimization including gate-sizing and Vth-assignments, which is efficient but highly timing consuming. Many graph neural network (GNN) based methods are recently proposed for fast and accurate ECO power optimization by considering neighbors' information. Nonetheless, these works fail to learn high-quality node representations on directed graph since they treat all neighbors uniformly when gathering their information and lack local topology information from neighbors one or two-hop away. In this paper, we introduce a directed GNN based method which learns information from different neighbors respectively and contains rich local topology information, which was validated by the Opencores and IWLS 2005 benchmarks with TSMC 28nm technology. Experimental results show that our approach outperforms prior GNN based methods with at least 7.8% and 7.6% prediction accuracy improvement for seen and unseen designs respectively as well as 8.3% to 29.0% leakage optimization improvement. Compared with commercial EDA tool PrimeTime, the proposed framework achieves similar power optimization results with up to 12X runtime improvement.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125946235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Leakage through Self-Terminated Write Schemes in Memristive Caches","authors":"Jonas Krautter, M. Mayahinia, Dennis R. E. Gnad, M. Tahoori","doi":"10.1109/asp-dac52403.2022.9712492","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712492","url":null,"abstract":"Memory cells in emerging non-volatile resistive memories often have asymmetric switching properties, where reliable write operations are achieved by setting the write period to a fixed value. To improve their performance and energy efficiency, self-terminating write schemes have been proposed, in which the write signal is stopped after the required state change has been observed. In this work, we show how this data-dependent write latency can be exploited as a side-channel in multiple ways to unveil restricted memory content. Moreover, we discuss and evaluate potential approaches to address the issue.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125970003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Robustness of Microfluidic Networks","authors":"G. Fink, Philipp Ebner, Sudip Poddar, R. Wille, J. Kepler","doi":"10.1109/asp-dac52403.2022.9712527","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712527","url":null,"abstract":"Microfluidic devices, often in the form of Lab-on-a-Chip (LoCs), are successfully utilized in many domains such as medicine, chemistry, biology, etc. However, neither the fabrication process nor the respectively used materials are perfect and, thus, defects are frequently induced into the actual physical realization of the device. This is especially critical for sensitive devices such as droplet-based microfluidic networks that are able to route droplets inside channels along different paths by only exploiting passive hydrodynamic effects. However, these passive hydrodynamic effects are very sensitive and already slight changes of parameters (e.g., in the channel width) can alter the behavior, even in such a way that the intended functionality of the network breaks. Hence, it is important that microfluidic networks become robust against such defects in order to prevent erroneous behavior. But considering such defects during the design process is a non-trivial task and, therefore, designers mostly neglected such considerations thus far. To overcome this problem, we propose a robustness improvement process that allows to optimize an initial design in such a way that it becomes more robust against defects (while still retaining the original behavior of the initial design). To this end, we first utilize a metric to compare the robustness of different designs and, afterwards, discuss methods that aim to improve the robustness. The metric and methods are demonstrated by an example and also tested on several networks to show the validity of the robustness improvement process.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129879657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Viability of Decision Trees for Learning Models of Systems","authors":"Swantje Plambeck, Lutz Schammer, Görschwin Fey","doi":"10.1109/asp-dac52403.2022.9712579","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712579","url":null,"abstract":"Abstract models of embedded systems are useful for various tasks, ranging from diagnosis, through testing to monitoring at run-time. However, deriving a model for an unknown system is difficult. Generic learners like decision trees can identify specific properties of systems and have been applied successfully, e.g., for anomaly detection and test case identification. We consider Decision Tree Learning (DTL) to derive a new type of model from given observations with bounded history for systems that have a Mealy machine representation. We prove theoretical limitations and evaluate the practical characteristics in an experimental validation.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133277334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transient Adjoint DAE Sensitivities: a Complete, Rigorous, and Numerically Accurate Formulation","authors":"Naomi Sagan, J. Roychowdhury","doi":"10.1109/asp-dac52403.2022.9712537","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712537","url":null,"abstract":"Almost all practical systems rely heavily on physical parameters. As a result, parameter sensitivity, or the extent to which perturbations in parameter values affect the state of a system, is intrinsically connected to system design and optimization. We present TADsens, a method for computing the parameter sensitivities of an output of a differential algebraic equation (DAE) system. Specifically, we provide rigorous, insightful theory for adjoint sensitivity computation of DAEs, along with an efficient and numerically well-posed algorithm implemented in Berkeley MAPP. Our theory and implementation advances resolve longstanding issues that have impeded adoption of adjoint transient sensitivities in circuit simulators for over 5 decades. We present results and comparisons on two nonlinear analog circuits. TADsens is numerically well posed and accurate, and faster by a factor of 300 over direct sensitivity computation on a circuit with over 150 unknowns and 600 parameters.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134117592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Accuracy Reconfigurable Vector Accelerator Based on Approximate Logarithmic Multipliers","authors":"Lingxia Hou, Yutaka Masuda, T. Ishihara","doi":"10.1109/ASP-DAC52403.2022.9712504","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712504","url":null,"abstract":"The logarithmic approximate multiplier proposed by Mitchell provides an efficient alternative to accurate multipliers in terms of area and power consumption. However, its maximum error of 11.1% makes it difficult to deploy in applications requiring high accuracy. To widely reduce the error of the Mitchell multiplier, this paper proposes a novel operand decomposition method which decomposes one operand into multiple operands and calculates them using multiple Mitchell multipliers. Based on this operand decomposition, this paper also proposes an accuracy reconfigurable vector accelerator which can provide a required computational accuracy with a high parallelism. The proposed vector accelerator dramatically reduces the area by more than half from the accurate multiplier array while satisfying the required accuracy for various applications. The experimental results show that our proposed vector accelerator behaves well in image processing and robot localization.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134210029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computation-in-Memory Accelerators for Secure Graph Database: Opportunities and Challenges","authors":"Md Tanvir Arafin","doi":"10.1109/ASP-DAC52403.2022.9712502","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712502","url":null,"abstract":"This work presents the challenges and opportunities for developing computing-in-memory (CIM) accelerators to support secure graph databases (GDB). First, we examine the database backend of common GDBs to understand the feasibility of CIM-based hardware architectures to speed up database queries. Then, we explore standard accelerator designs for graph computation. Next, we present the security issues of graph databases and survey how advanced cryptographic techniques such as homomorphic encryption and zero-knowledge protocols can execute privacy-preserving queries in a secure graph database. After that, we illustrate possible CIM architectures for integrating secure computation with GDB acceleration. Finally, we discuss the design overheads, useability, and potential challenges for building CIM-based accelerators for supporting data-centric calculations. Overall, we find that computing-in-memory primitives have the potential to play a crucial role in realizing the next generation of fast and secure graph databases.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130022478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SYNTHNET: A High-throughput yet Energy-efficient Combinational Logic Neural Network","authors":"Tianen Chen, Taylor Kemp, Younghyun Kim","doi":"10.1109/asp-dac52403.2022.9712554","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712554","url":null,"abstract":"In combinational logic neural networks (CLNNs), neurons are realized as combinational logic circuits or look-up tables (LUTs). They make make extremely low-latency inference possible by performing the computation with pure hardware without loading weights from the memory. The high throughput, however, is powered by massively parallel logic circuits or LUTs and hence comes with high area occupancy and high energy consumption. We present SYNTHNET, a novel CLNN design method that effectively identifies and keeps only the sublogics that play a critical role in the accuracy and remove those which do not contribute to improving the accuracy. It captures the abundant redundancy in NNs that can be exploited only in CLNNs, and thereby dramatically reduces the energy consumption of CLNNs with minimal accuracy degradation. We prove the efficacy of SYNTHNET on the CIFAR-10 dataset, maintaining a competitive accuracy while successfully replacing layers of a VGG-style network which traditionally uses memory-based floating point operations with combinational logic. Experimental results suggest our design can reduce energy-consumption of CLNNs more than 90% compared to the state-of-the-art design.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130780403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}