2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS): Latest Publications

Sub-Word Parallel Precision-Scalable MAC Engines for Efficient Embedded DNN Inference
L. Mei, Mohit Dandekar, D. Rodopoulos, J. Constantin, P. Debacker, R. Lauwereins, M. Verhelst
DOI: 10.1109/AICAS.2019.8771481
Abstract: To enable energy-efficient embedded execution of Deep Neural Networks (DNNs), the critical sections of these workloads, their multiply-accumulate (MAC) operations, need to be carefully optimized. The state of the art (SotA) pursues this through run-time precision-scalable MAC operators, which can support the varying precision needs of DNNs in an energy-efficient way. Yet, to implement the adaptable-precision MAC operation, most SotA solutions rely on separately optimized low-precision multipliers and a precision-variable accumulation scheme, with the possible disadvantages of high control complexity and degraded throughput. This paper first optimizes one of the most effective SotA techniques to support fully-connected DNN layers. This mode, which exploits the transformation of a high-precision multiplier into independent parallel low-precision multipliers, is called the Sum Separate (SS) mode. In addition, this work proposes an alternative low-precision scheme, the implicit accumulation of multiple low-precision products within the multiplier itself, called the Sum Together (ST) mode. Based on the two types of MAC arrangements explored, corresponding architectures are proposed to implement DNN processing. The two architectures, yielding the same throughput, are compared at different working precisions (2/4/8/16-bit) based on post-synthesis simulation. The results show that the proposed ST-mode architecture outperforms the earlier SS mode by up to 1.6× in energy efficiency (TOPS/W) and 1.5× in area efficiency (GOPS/mm²).
Citations: 22
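The ST ("Sum Together") idea of accumulating several low-precision products inside one wide multiplication can be illustrated in software. This is a hedged toy sketch of sub-word packing for unsigned 4-bit operands, not the paper's hardware design; the field width `n` and packing order are illustrative choices:

```python
def st_mode_mac(a1, b1, a2, b2, n=10):
    """Compute a1*b1 + a2*b2 with a single wide multiplication.

    Toy illustration of sum-together (ST) sub-word packing for
    unsigned 4-bit operands; n is guard-banded so the low-field
    cross product cannot carry into the middle field.
    """
    assert all(0 <= x < 16 for x in (a1, b1, a2, b2))
    packed_a = (a1 << n) | a2        # a1 in the high field, a2 in the low field
    packed_b = (b2 << n) | b1        # note the swapped order on the B side
    product = packed_a * packed_b
    # product = a1*b2 << 2n  +  (a1*b1 + a2*b2) << n  +  a2*b1
    # so the middle n-bit field holds the sum of both products
    return (product >> n) & ((1 << n) - 1)

print(st_mode_mac(3, 5, 7, 2))   # 3*5 + 7*2 = 29
```

A hardware ST multiplier realizes the same effect inside the partial-product array rather than by literal bit packing, which is where the control-complexity and throughput advantages come from.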
AICAS 2019 Author Index
DOI: 10.1109/aicas.2019.8771499
Citations: 0
AIP: Saving the DRAM Access Energy of CNNs Using Approximate Inner Products
C. Cheng, Ren-Shuo Liu
DOI: 10.1109/AICAS.2019.8771595
Abstract: In this work, we propose AIP (Approximate Inner Product), which approximates the inner products of CNNs' fully-connected (FC) layers using only a small fraction (e.g., one-sixteenth) of the parameters. We observe that FC layers possess several characteristics that naturally fit AIP: the dropout training strategy, rectified linear units (ReLUs), and the top-n operator. Experimental results show that 48% of DRAM access energy can be reduced at the cost of only 2% top-5 accuracy loss (for VGG-f).
Citations: 0
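The core idea, approximating an FC layer's inner products from a small fraction of its weights so that only that fraction must be fetched from DRAM, can be sketched as follows. This is a hypothetical strided-subsampling illustration; the paper's actual selection and scaling scheme may differ:

```python
import numpy as np

def approx_inner_products(x, W, fraction=16):
    """Approximate x @ W using only every `fraction`-th input
    dimension, rescaled so the expected magnitude of each score
    is preserved. Only 1/fraction of W is touched."""
    idx = np.arange(0, W.shape[0], fraction)
    return fraction * (x[idx] @ W[idx, :])

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)          # FC input activations
W = rng.standard_normal((4096, 1000))  # FC weight matrix
exact = x @ W
approx = approx_inner_products(x, W)
# individual scores are noisy, but the top-n operator that follows
# an FC classifier layer is tolerant of this kind of noise
```

This matches the observation in the abstract: ReLUs and the top-n operator only need the largest scores to be ranked roughly correctly, not every inner product to be exact.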
Heterogeneous Activation Function Extraction for Training and Optimization of SNN Systems
A. Zjajo, Sumeet S. Kumar, R. V. Leuken
DOI: 10.1109/AICAS.2019.8771619
Abstract: The energy efficiency and computation capability of analog/mixed-signal spiking neural networks offer a capable platform for implementing cognitive tasks on resource-limited embedded platforms. However, inherent mismatch in analog devices severely influences the accuracy and reliability of the computing system. In this paper, we devise an efficient algorithm for extracting the heterogeneous activation functions of analog hardware neurons as a set of constraints in an off-line training and optimization process, and examine how compensation of the mismatch effects influences the synchronicity and information-processing capabilities of the system.
Citations: 0
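The extraction step can be pictured as fitting each hardware neuron's measured input/output curve and then using those per-neuron curves, instead of one nominal activation, during off-line training. This is a minimal sketch under assumed tanh-like neurons with hypothetical gain/offset mismatch; the paper's extraction algorithm is more involved:

```python
import numpy as np

def extract_activation(v_in, v_out, degree=5):
    """Fit a per-neuron activation function from measured
    input/output pairs of an analog hardware neuron."""
    return np.polynomial.Polynomial.fit(v_in, v_out, degree)

# hypothetical measurements from two mismatched neurons
v = np.linspace(-1.0, 1.0, 50)
neuron_a = np.tanh(1.8 * v + 0.05)   # gain/offset mismatch, assumed shape
neuron_b = np.tanh(2.2 * v - 0.10)
acts = [extract_activation(v, y) for y in (neuron_a, neuron_b)]

# the off-line forward pass then evaluates each neuron's own
# fitted curve rather than a single nominal activation
h = [float(f(0.3)) for f in acts]
```

Training against the extracted curves lets the optimizer compensate for mismatch, which is what the abstract describes as treating the heterogeneous activations as constraints.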
Analog Weights in ReRAM DNN Accelerators
J. Eshraghian, S. Kang, Seungbum Baek, G. Orchard, H. Iu, W. Lei
DOI: 10.1109/AICAS.2019.8771550
Abstract: Artificial neural networks have become ubiquitous in modern life, which has triggered the emergence of a new class of application-specific integrated circuits for their acceleration. ReRAM-based accelerators have gained significant traction due to their ability to leverage in-memory computation. In a crossbar structure, they can perform multiply-and-accumulate operations more efficiently than standard CMOS logic. By virtue of being resistive switches, ReRAM devices can only reliably store one of two states, a severe limitation on the range of values in a computational kernel. This paper presents a novel scheme that alleviates the single-bit-per-device restriction by exploiting the frequency dependence of v-i plane hysteresis, assigning kernel information not only to the device conductance but also partially distributing it to the frequency of a time-varying input. We show this approach reduces average power consumption for a single crossbar convolution by up to a factor of 16 for an unsigned 8-bit input image, where each convolutional process consumes a worst case of 1.1 mW, and reduces area by a factor of 8, without reducing accuracy to the level of binarized neural networks. This represents a large saving in computing cost when many simultaneous in-situ multiply-and-accumulate processes occur across different crossbars.
Citations: 33
AnalogHTM: Memristive Spatial Pooler Learning with Backpropagation
O. Krestinskaya, A. P. James
DOI: 10.1109/AICAS.2019.8771628
Abstract: The spatial pooler is responsible for feature extraction in Hierarchical Temporal Memory (HTM). In this paper, we present analog backpropagation learning circuits integrated into a memristive spatial-pooler circuit design. Using 0.18 μm CMOS technology and TiOx memristor models, the maximum on-chip area and power consumption of the proposed design are 8335.074 μm² and 51.55 mW, respectively. The system is tested on a face recognition problem using the AR face database, achieving a recognition accuracy of 90%.
Citations: 3
Conversion of Synchronous Artificial Neural Network to Asynchronous Spiking Neural Network using Sigma-Delta Quantization
A. Yousefzadeh, Sahar Hosseini, Priscila C. Holanda, Sam Leroux, T. Werner, T. Serrano-Gotarredona, B. Linares-Barranco, B. Dhoedt, P. Simoens
DOI: 10.1109/AICAS.2019.8771624
Abstract: Artificial Neural Networks (ANNs) show great performance in several data-analysis tasks, including visual and auditory applications. However, direct implementation of these algorithms without considering the sparsity of data requires high processing power, consumes vast amounts of energy, and suffers from scalability issues. Inspired by biology, one method that can reduce power consumption and allow scalability in the implementation of neural networks is asynchronous processing and communication by means of action potentials, so-called spikes. In this work, we use the well-known sigma-delta quantization method and introduce a straightforward solution for converting an Artificial Neural Network to a Spiking Neural Network that can be implemented asynchronously on a neuromorphic platform. Briefly, we use asynchronous spikes to communicate the quantized output activations of the neurons. Although the proposed mechanism is simple and applicable to a wide range of different ANNs, it outperforms state-of-the-art implementations in terms of accuracy and energy consumption. All source code for this project is available upon request for academic purposes.
Citations: 20
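The sigma-delta communication scheme can be sketched in a few lines (a simplified illustration with an assumed quantization step, not the authors' code): each neuron quantizes its activation and transmits only the signed change since the previous timestep as spikes, and the receiver accumulates spikes to recover the quantized activation exactly:

```python
def sigma_delta_encode(activations, step=0.25):
    """Per timestep, emit the integer change in the quantized
    activation (positive/negative 'spike counts')."""
    prev_q = 0
    spikes = []
    for a in activations:
        q = round(a / step)          # quantize to multiples of `step`
        spikes.append(q - prev_q)    # transmit only the change
        prev_q = q
    return spikes

def sigma_delta_decode(spikes, step=0.25):
    """Accumulate received spikes to reconstruct the quantized
    activation sequence."""
    q, out = 0, []
    for s in spikes:
        q += s
        out.append(q * step)
    return out

acts = [0.0, 0.9, 1.0, 1.0, 0.3]
spikes = sigma_delta_encode(acts)
print(spikes)                       # zeros where the activation is unchanged
print(sigma_delta_decode(spikes))   # quantized version of `acts`
```

Because only changes are transmitted, slowly varying activations produce long runs of zero spikes, which is exactly the sparsity that makes asynchronous neuromorphic implementation energy-efficient.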
Special Session: 2018 Low-Power Image Recognition Challenge and Beyond
M. Ardi, A. Berg, Bo Chen, Yen-kuang Chen, Yiran Chen, Donghyun Kang, Junhyeok Lee, Seungjae Lee, Yang Lu, Yung-Hsiang Lu, Fei Sun
DOI: 10.1109/AICAS.2019.8771606
Abstract: The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015. The competition identifies the best technologies that can detect objects in images efficiently (short execution time and low energy consumption). This paper summarizes LPIRC 2018 by describing the winners' solutions. The paper also discusses the future of low-power computer vision.
Citations: 1
NeuroPilot: A Cross-Platform Framework for Edge-AI
Tung-Chien Chen, Wei-Ting Wang, Kloze Kao, Chia-Lin Yu, C. Lin, Shu-Hsin Chang, Pei-Kuei Tsung
DOI: 10.1109/AICAS.2019.8771536
Abstract: Artificial intelligence (AI) has spread from cloud servers to edge devices because of its rapid response, privacy, robustness, and efficient use of network bandwidth. However, deploying computation- and memory-bandwidth-intensive AI on edge devices is challenging because power and hardware resources are limited. The varied needs of applications, the diversity of devices, and fragmented supporting tools make integration difficult. In this paper, NeuroPilot, a cross-platform framework for edge AI, is introduced. Technologies at the software, hardware, and integration levels are proposed to achieve high performance while preserving flexibility. The NeuroPilot solution provides superior edge-AI capability for a wide range of applications.
Citations: 8
Automatic HCC Detection Using Convolutional Network with Multi-Magnification Input Images
Wei-Che Huang, P. Chung, H. Tsai, N. Chow, Y. Juang, H. Tsai, Shih-Hsuan Lin, Cheng-Hsiung Wang
DOI: 10.1109/AICAS.2019.8771535
Abstract: Postoperative pathologic examination of stained liver tissue is an important step in identifying prognostic factors for follow-up care of liver cancer. Traditionally, liver cancer detection is performed by pathologists observing the entire biological tissue, resulting in a heavy workload and potential misjudgment. Accordingly, automatic pathological examination has long been an active area of study. Most existing cancer-detection approaches, however, only extract cell-level information from single-scale high-magnification patches. In liver tissue, common cell-change phenomena such as apoptosis, necrosis, and steatosis appear similar in tumor and benign tissue. Hence, detection may fail when a patch covers only the changed-cell area, which cannot provide enough information about the neighboring cell structure. To overcome this problem, a convolutional network architecture with multi-magnification input can provide not only cell-level information, by referencing high-magnification patches, but also cell-structure information, by referencing low-magnification patches. The detection algorithm consists of two main stages: 1) extraction of cell-level and cell-structure-level feature maps from high-magnification and low-magnification images, respectively, by separate convolutional networks, and 2) integration of the multi-magnification features by a fully connected network. In this paper, VGG16 and Inception V4 were applied as the base convolutional networks for the liver-tumor detection task. The experimental results showed that the VGG16-based multi-magnification-input convolutional network achieved 91% mIoU on the HCC tumor-detection task. In addition, a comparison between single-scale CNN (SSCN) and multi-scale CNN (MSCN) approaches demonstrated that multi-scale patches provide better performance on the HCC classification task.
Citations: 13
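The two-stage structure described in the abstract (separate feature extractors per magnification, fused by a fully connected network) can be outlined schematically. This is a toy numpy sketch with hypothetical dimensions and a random projection standing in for the CNN backbone; the paper uses VGG16/Inception V4:

```python
import numpy as np

rng = np.random.default_rng(1)

def backbone(patch, out_dim=128):
    """Stand-in for a CNN feature extractor (e.g. VGG16): here
    just a random projection of the flattened patch + ReLU."""
    flat = patch.reshape(-1)
    W = rng.standard_normal((out_dim, flat.size)) * 0.01
    return np.maximum(W @ flat, 0.0)

def classify(high_mag_patch, low_mag_patch):
    # 1) per-magnification feature maps from separate networks
    f_cell = backbone(high_mag_patch)     # cell-level detail
    f_struct = backbone(low_mag_patch)    # cell-structure context
    # 2) fuse by concatenation + fully connected layer
    fused = np.concatenate([f_cell, f_struct])
    W_fc = rng.standard_normal((2, fused.size)) * 0.01
    return W_fc @ fused                   # [benign, tumor] scores

logits = classify(rng.standard_normal((64, 64)),
                  rng.standard_normal((64, 64)))
```

The key design point is that the low-magnification branch sees a wider field of view at coarser resolution, so the fused feature vector carries both the changed cells and their neighborhood structure.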