2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS): Latest Papers, Page 9

Software/Hardware Co-design for Multi-modal Multi-task Learning in Autonomous Systems

2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS) Pub Date : 2021-04-08 DOI: 10.1109/AICAS51828.2021.9458577

Cong Hao, Deming Chen

Abstract: Optimizing the quality of result (QoR) and the quality of service (QoS) of AI-empowered autonomous systems simultaneously is very challenging. First, there are multiple input sources, e.g., multi-modal data from different sensors, requiring diverse data preprocessing, sensor fusion, and feature aggregation. Second, there are multiple tasks that require various AI models to run simultaneously, e.g., perception, localization, and control. Third, the computing and control system is heterogeneous, composed of hardware components with varied features, such as embedded CPUs, GPUs, FPGAs, and dedicated accelerators. Therefore, autonomous systems essentially require multi-modal multi-task (MMMT) learning that is aware of hardware performance and implementation strategies. While MMMT learning has attracted intensive research interest, its applications in autonomous systems are still underexplored. In this paper, we first discuss the opportunities of applying MMMT techniques in autonomous systems, and then discuss the unique challenges that must be solved. In addition, we discuss the necessity and opportunities of MMMT model and hardware co-design, which is critical for autonomous systems, especially those with power/resource-limited or heterogeneous platforms. We formulate the co-design of the MMMT model and its heterogeneous hardware implementation as a differentiable optimization problem, with the objective of improving the solution quality and reducing the overall power consumption and critical path latency. We advocate for further exploration of MMMT in autonomous systems and of software/hardware co-design solutions.
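The abstract describes co-design as a differentiable optimization problem but gives no formulation. The sketch below is only an illustration of the general idea, not the authors' method: it relaxes a discrete task-to-device assignment into softmax probabilities and minimizes an expected hardware cost by gradient descent. All names and numbers (`relaxed_assignment`, `POWER_W`, `LATENCY_MS`, `LAMBDA`) are hypothetical, the model-quality term is omitted, and the critical-path latency is simplified to a per-task latency penalty.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_assignment(cost, steps=200, lr=0.5):
    """Minimize the expected cost of a soft task-to-device assignment.

    cost[i][d] is the (hypothetical) cost of running task i on device d.
    Each task keeps one logit per device; softmax turns the logits into
    assignment probabilities, which makes the expected cost differentiable.
    """
    logits = [[0.0] * len(row) for row in cost]
    for _ in range(steps):
        for i, row in enumerate(cost):
            p = softmax(logits[i])
            expected = sum(pk * ck for pk, ck in zip(p, row))
            # Analytic gradient: d(expected)/d(logit_k) = p_k * (c_k - expected)
            for k, ck in enumerate(row):
                logits[i][k] -= lr * p[k] * (ck - expected)
    return [softmax(task_logits) for task_logits in logits]


# Hypothetical per-task power (W) and latency (ms) on three device types.
DEVICES = ["cpu", "gpu", "fpga"]
POWER_W = [[2.0, 8.0, 3.0], [2.5, 9.0, 3.5]]
LATENCY_MS = [[40.0, 2.0, 20.0], [15.0, 6.0, 4.0]]
LAMBDA = 0.5  # assumed weight that trades latency against power

COST = [
    [p + LAMBDA * t for p, t in zip(p_row, t_row)]
    for p_row, t_row in zip(POWER_W, LATENCY_MS)
]

probs = relaxed_assignment(COST)
chosen = [DEVICES[max(range(len(p)), key=p.__getitem__)] for p in probs]
print(chosen)
```

In a full co-design flow the same relaxation would be optimized jointly with the model's task loss, so that architecture and hardware mapping influence each other; this sketch only covers the hardware-mapping half.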

Citations: 16

A Flexible and Fast PyTorch Toolkit for Simulating Training and Inference on Analog Crossbar Arrays

2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS) Pub Date : 2021-04-05 DOI: 10.1109/AICAS51828.2021.9458494

M. Rasch, Diego Moreda, T. Gokmen, M. L. Gallo, F. Carta, Cindy Goldberg, Kaoutar El Maghraoui, A. Sebastian, V. Narayanan

Abstract: We introduce the IBM Analog Hardware Acceleration Kit, a new, first-of-a-kind open-source toolkit for conveniently simulating analog crossbar arrays from within PyTorch (freely available at https://github.com/IBM/aihwkit). The toolkit is under active development and is centered around the concept of an "analog tile", which captures the computations performed on a crossbar array. Analog tiles are building blocks that can be used to extend existing network modules with analog components and to compose arbitrary artificial neural networks (ANNs) using the flexibility of the PyTorch framework. Analog tiles can be conveniently configured to emulate a wide range of analog hardware characteristics and their non-idealities, such as device-to-device and cycle-to-cycle variations, resistive device response curves, and weight and output noise. Additionally, the toolkit makes it possible to design custom unit cell configurations and to use advanced analog optimization algorithms such as Tiki-Taka. Moreover, the backward and update behavior can be set to "ideal" to enable hardware-aware training features for chips that target inference acceleration only. To evaluate the inference accuracy of such chips over time, we provide statistical programming noise and drift models calibrated on phase-change memory hardware. Our new toolkit is fully GPU accelerated and can be used to conveniently estimate the impact of material properties and non-idealities of future analog technology on the accuracy of arbitrary ANNs.
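To make the "analog tile" idea concrete, here is a minimal pure-Python sketch of a crossbar-style matrix-vector product with two of the non-idealities the abstract names: bounded weights and additive output noise. It deliberately does not use the toolkit's API; the class `ToyAnalogTile` and its parameters (`out_noise`, `w_max`, `seed`) are hypothetical names chosen for this illustration.

```python
import random


class ToyAnalogTile:
    """Toy stand-in for an analog crossbar tile (not the aihwkit API).

    Weights are clipped to [-w_max, w_max] to mimic a bounded conductance
    range, and Gaussian noise is added to every output line.
    """

    def __init__(self, weights, out_noise=0.0, w_max=1.0, seed=0):
        self.rng = random.Random(seed)  # seeded for reproducible noise
        self.w_max = w_max
        self.out_noise = out_noise
        self.weights = [[self._clip(w) for w in row] for row in weights]

    def _clip(self, w):
        return max(-self.w_max, min(self.w_max, w))

    def forward(self, x):
        """Return W @ x plus per-output Gaussian noise."""
        out = []
        for row in self.weights:
            y = sum(w * xi for w, xi in zip(row, x))
            out.append(y + self.rng.gauss(0.0, self.out_noise))
        return out


WEIGHTS = [[0.5, -0.25], [2.0, 0.1]]  # 2.0 exceeds w_max and is clipped to 1.0
x = [1.0, 2.0]

ideal = ToyAnalogTile(WEIGHTS).forward(x)
noisy = ToyAnalogTile(WEIGHTS, out_noise=0.05).forward(x)
print(ideal, noisy)
```

Comparing `ideal` and `noisy` over many inputs is the kind of experiment such a simulator enables: it shows how much a given noise level perturbs a layer's outputs before any hardware exists.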

Citations: 50

An Energy-Efficient Quad-Camera Visual System for Autonomous Machines on FPGA Platform

2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS) Pub Date : 2021-04-01 DOI: 10.1109/AICAS51828.2021.9458486

Zishen Wan, Yuyang Zhang, A. Raychowdhury, Bo Yu, Yanjun Zhang, Shaoshan Liu

Citations: 10

ECG-TCN: Wearable Cardiac Arrhythmia Detection with a Temporal Convolutional Network

2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS) Pub Date : 2021-03-25 DOI: 10.1109/AICAS51828.2021.9458520

Thorir Mar Ingolfsson, Xiaying Wang, Michael Hersche, A. Burrello, L. Cavigelli, L. Benini

Abstract: Personalized ubiquitous healthcare solutions require energy-efficient wearable platforms that provide an accurate classification of bio-signals while consuming low average power for long-term battery-operated use. Single-lead electrocardiogram (ECG) signals provide the ability to detect, classify, and even predict cardiac arrhythmia. In this paper we propose a novel temporal convolutional network (TCN) that achieves high accuracy while still being feasible for wearable platform use. Experimental results on the ECG5000 dataset show that the TCN has an accuracy (94.2%) similar to that of the state-of-the-art (SoA) network while achieving an improvement of 16.5% in the balanced accuracy score. This accurate classification is done with $27\times$ fewer parameters and $37\times$ fewer multiply-accumulate operations. We test our implementation on two publicly available platforms: the STM32L475, which is based on the ARM Cortex M4F, and the GreenWaves Technologies GAP8 on the GAPuino board, based on $1+8$ RISC-V CV32E40P cores. Measurements show that the GAP8 implementation respects the real-time constraints while consuming 0.10 mJ per inference. With 9.91 GMAC/s/W, it is $23.0\times$ more energy-efficient and $46.85\times$ faster than an implementation on the ARM Cortex M4F (0.43 GMAC/s/W). Overall, we obtain 8.1% higher accuracy while consuming $19.6\times$ less energy and being $35.1\times$ faster compared to a previous SoA embedded implementation.
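The efficiency figures in the abstract can be cross-checked with simple arithmetic. The sketch below recomputes the GAP8-to-Cortex-M4F efficiency ratio from the two reported GMAC/s/W values, and derives a rough MAC count per inference. The second number is an estimate not stated in the abstract: it assumes the reported 9.91 GMAC/s/W applies uniformly over a whole inference, using the identity that GMAC/s/W equals GMAC per joule.

```python
# Figures reported in the abstract.
GAP8_GMAC_PER_S_PER_W = 9.91
M4F_GMAC_PER_S_PER_W = 0.43
GAP8_ENERGY_PER_INFERENCE_MJ = 0.10

# Ratio of the two efficiency figures; the abstract states 23.0x.
efficiency_ratio = GAP8_GMAC_PER_S_PER_W / M4F_GMAC_PER_S_PER_W

# GMAC/s/W is the same unit as GMAC/J, so multiplying by the energy per
# inference gives an estimated operation count (an assumption-laden
# back-of-the-envelope value, not a number from the paper).
macs_per_inference = (
    GAP8_GMAC_PER_S_PER_W * 1e9 * GAP8_ENERGY_PER_INFERENCE_MJ * 1e-3
)

print(f"efficiency ratio: {efficiency_ratio:.1f}x")
print(f"estimated MACs per inference: {macs_per_inference:.3g}")
```

The ratio agrees with the stated $23.0\times$, and the estimate suggests a network on the order of one million MACs per inference under the stated assumption.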

Citations: 20