2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS): Latest Publications

Sub-Word Parallel Precision-Scalable MAC Engines for Efficient Embedded DNN Inference
L. Mei, Mohit Dandekar, D. Rodopoulos, J. Constantin, P. Debacker, R. Lauwereins, M. Verhelst
DOI: 10.1109/AICAS.2019.8771481
Abstract: To enable energy-efficient embedded execution of Deep Neural Networks (DNNs), the critical sections of these workloads, their multiply-accumulate (MAC) operations, need to be carefully optimized. The state of the art (SotA) pursues this through run-time precision-scalable MAC operators, which can support the varying precision needs of DNNs in an energy-efficient way. Yet, to implement the adaptable-precision MAC operation, most SotA solutions rely on separately optimized low-precision multipliers and a precision-variable accumulation scheme, with the possible disadvantages of high control complexity and degraded throughput. This paper first optimizes one of the most effective SotA techniques to support fully-connected DNN layers. This mode, which exploits the transformation of a high-precision multiplier into independent parallel low-precision multipliers, is called the Sum Separate (SS) mode. In addition, this work proposes an alternative low-precision scheme, the implicit accumulation of multiple low-precision products within the multiplier itself, called the Sum Together (ST) mode. Based on the two types of MAC arrangements explored, corresponding architectures are proposed to implement DNN processing. The two architectures, yielding the same throughput, are compared at different working precisions (2/4/8/16-bit) based on post-synthesis simulation. The results show that the proposed ST-mode architecture outperforms the earlier SS mode by up to 1.6× in energy efficiency (TOPS/W) and 1.5× in area efficiency (GOPS/mm²).
Citations: 22
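The ST ("Sum Together") idea of accumulating several low-precision products inside one wide multiplication can be illustrated in software. This is a hedged toy sketch of sub-word packing for unsigned 4-bit operands, not the paper's hardware design; the field width `n` and packing order are illustrative choices:

```python
def st_mode_mac(a1, b1, a2, b2, n=10):
    """Compute a1*b1 + a2*b2 with a single wide multiplication.

    Toy illustration of sum-together (ST) sub-word packing for
    unsigned 4-bit operands; n is guard-banded so the low-field
    cross product cannot carry into the middle field.
    """
    assert all(0 <= x < 16 for x in (a1, b1, a2, b2))
    packed_a = (a1 << n) | a2        # a1 in the high field, a2 in the low field
    packed_b = (b2 << n) | b1        # note the swapped order on the B side
    product = packed_a * packed_b
    # product = a1*b2 << 2n  +  (a1*b1 + a2*b2) << n  +  a2*b1
    # so the middle n-bit field holds the sum of both products
    return (product >> n) & ((1 << n) - 1)

print(st_mode_mac(3, 5, 7, 2))   # 3*5 + 7*2 = 29
```

A hardware ST multiplier realizes the same effect inside the partial-product array rather than by literal bit packing, which is where the control-complexity and throughput advantages come from.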
AICAS 2019 Author Index
DOI: 10.1109/aicas.2019.8771499
Citations: 0
AIP: Saving the DRAM Access Energy of CNNs Using Approximate Inner Products
C. Cheng, Ren-Shuo Liu
DOI: 10.1109/AICAS.2019.8771595
Abstract: In this work, we propose AIP (Approximate Inner Product), which approximates the inner products of CNNs' fully-connected (FC) layers using only a small fraction (e.g., one-sixteenth) of the parameters. We observe that FC layers possess several characteristics that naturally fit AIP: the dropout training strategy, rectified linear units (ReLUs), and the top-n operator. Experimental results show that 48% of DRAM access energy can be reduced at the cost of only 2% top-5 accuracy loss (for VGG-f).
Citations: 0
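The core idea, approximating an FC layer's inner products from a small fraction of its weights so that only that fraction must be fetched from DRAM, can be sketched as follows. This is a hypothetical strided-subsampling illustration; the paper's actual selection and scaling scheme may differ:

```python
import numpy as np

def approx_inner_products(x, W, fraction=16):
    """Approximate x @ W using only every `fraction`-th input
    dimension, rescaled so the expected magnitude of each score
    is preserved. Only 1/fraction of W is touched."""
    idx = np.arange(0, W.shape[0], fraction)
    return fraction * (x[idx] @ W[idx, :])

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)          # FC input activations
W = rng.standard_normal((4096, 1000))  # FC weight matrix
exact = x @ W
approx = approx_inner_products(x, W)
# individual scores are noisy, but the top-n operator that follows
# an FC classifier layer is tolerant of this kind of noise
```

This matches the observation in the abstract: ReLUs and the top-n operator only need the largest scores to be ranked roughly correctly, not every inner product to be exact.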
Heterogeneous Activation Function Extraction for Training and Optimization of SNN Systems
A. Zjajo, Sumeet S. Kumar, R. V. Leuken
DOI: 10.1109/AICAS.2019.8771619
Abstract: The energy efficiency and computation capability of analog/mixed-signal spiking neural networks offer a capable platform for implementing cognitive tasks on resource-limited embedded platforms. However, inherent mismatch in analog devices severely influences the accuracy and reliability of the computing system. In this paper, we devise an efficient algorithm for extracting the heterogeneous activation functions of analog hardware neurons as a set of constraints in an off-line training and optimization process, and examine how compensation of the mismatch effects influences the synchronicity and information-processing capabilities of the system.
Citations: 0
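The extraction step can be pictured as fitting each hardware neuron's measured input/output curve and then using those per-neuron curves, instead of one nominal activation, during off-line training. This is a minimal sketch under assumed tanh-like neurons with hypothetical gain/offset mismatch; the paper's extraction algorithm is more involved:

```python
import numpy as np

def extract_activation(v_in, v_out, degree=5):
    """Fit a per-neuron activation function from measured
    input/output pairs of an analog hardware neuron."""
    return np.polynomial.Polynomial.fit(v_in, v_out, degree)

# hypothetical measurements from two mismatched neurons
v = np.linspace(-1.0, 1.0, 50)
neuron_a = np.tanh(1.8 * v + 0.05)   # gain/offset mismatch, assumed shape
neuron_b = np.tanh(2.2 * v - 0.10)
acts = [extract_activation(v, y) for y in (neuron_a, neuron_b)]

# the off-line forward pass then evaluates each neuron's own
# fitted curve rather than a single nominal activation
h = [float(f(0.3)) for f in acts]
```

Training against the extracted curves lets the optimizer compensate for mismatch, which is what the abstract describes as treating the heterogeneous activations as constraints.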
Analog Weights in ReRAM DNN Accelerators
J. Eshraghian, S. Kang, Seungbum Baek, G. Orchard, H. Iu, W. Lei
DOI: 10.1109/AICAS.2019.8771550
Abstract: Artificial neural networks have become ubiquitous in modern life, which has triggered the emergence of a new class of application-specific integrated circuits for their acceleration. ReRAM-based accelerators have gained significant traction due to their ability to leverage in-memory computation. In a crossbar structure, they can perform multiply-and-accumulate operations more efficiently than standard CMOS logic. By virtue of being resistive switches, ReRAM devices can only reliably store one of two states, a severe limitation on the range of values in a computational kernel. This paper presents a novel scheme that alleviates the single-bit-per-device restriction by exploiting the frequency dependence of v-i plane hysteresis, assigning kernel information not only to the device conductance but also partially distributing it to the frequency of a time-varying input. We show this approach reduces average power consumption for a single crossbar convolution by up to a factor of 16 for an unsigned 8-bit input image, where each convolutional process consumes a worst case of 1.1 mW, and reduces area by a factor of 8, without reducing accuracy to the level of binarized neural networks. This represents a large saving in computing cost when many simultaneous in-situ multiply-and-accumulate processes occur across different crossbars.
Citations: 33
AnalogHTM: Memristive Spatial Pooler Learning with Backpropagation
O. Krestinskaya, A. P. James
DOI: 10.1109/AICAS.2019.8771628
Abstract: The spatial pooler is responsible for feature extraction in Hierarchical Temporal Memory (HTM). In this paper, we present analog backpropagation learning circuits integrated into a memristive spatial-pooler circuit design. Using 0.18 μm CMOS technology and TiOx memristor models, the maximum on-chip area and power consumption of the proposed design are 8335.074 μm² and 51.55 mW, respectively. The system is tested on a face recognition problem using the AR face database, achieving a recognition accuracy of 90%.
Citations: 3
Conversion of Synchronous Artificial Neural Network to Asynchronous Spiking Neural Network using Sigma-Delta Quantization
A. Yousefzadeh, Sahar Hosseini, Priscila C. Holanda, Sam Leroux, T. Werner, T. Serrano-Gotarredona, B. Linares-Barranco, B. Dhoedt, P. Simoens
DOI: 10.1109/AICAS.2019.8771624
Abstract: Artificial Neural Networks (ANNs) show great performance in several data-analysis tasks, including visual and auditory applications. However, direct implementation of these algorithms without considering the sparsity of data requires high processing power, consumes vast amounts of energy, and suffers from scalability issues. Inspired by biology, one method that can reduce power consumption and allow scalability in the implementation of neural networks is asynchronous processing and communication by means of action potentials, so-called spikes. In this work, we use the well-known sigma-delta quantization method and introduce a straightforward solution for converting an Artificial Neural Network to a Spiking Neural Network that can be implemented asynchronously on a neuromorphic platform. Briefly, we use asynchronous spikes to communicate the quantized output activations of the neurons. Although the proposed mechanism is simple and applicable to a wide range of different ANNs, it outperforms state-of-the-art implementations in terms of accuracy and energy consumption. All source code for this project is available upon request for academic purposes.
Citations: 20
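The sigma-delta communication scheme can be sketched in a few lines (a simplified illustration with an assumed quantization step, not the authors' code): each neuron quantizes its activation and transmits only the signed change since the previous timestep as spikes, and the receiver accumulates spikes to recover the quantized activation exactly:

```python
def sigma_delta_encode(activations, step=0.25):
    """Per timestep, emit the integer change in the quantized
    activation (positive/negative 'spike counts')."""
    prev_q = 0
    spikes = []
    for a in activations:
        q = round(a / step)          # quantize to multiples of `step`
        spikes.append(q - prev_q)    # transmit only the change
        prev_q = q
    return spikes

def sigma_delta_decode(spikes, step=0.25):
    """Accumulate received spikes to reconstruct the quantized
    activation sequence."""
    q, out = 0, []
    for s in spikes:
        q += s
        out.append(q * step)
    return out

acts = [0.0, 0.9, 1.0, 1.0, 0.3]
spikes = sigma_delta_encode(acts)
print(spikes)                       # zeros where the activation is unchanged
print(sigma_delta_decode(spikes))   # quantized version of `acts`
```

Because only changes are transmitted, slowly varying activations produce long runs of zero spikes, which is exactly the sparsity that makes asynchronous neuromorphic implementation energy-efficient.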
Special Session: 2018 Low-Power Image Recognition Challenge and Beyond
M. Ardi, A. Berg, Bo Chen, Yen-kuang Chen, Yiran Chen, Donghyun Kang, Junhyeok Lee, Seungjae Lee, Yang Lu, Yung-Hsiang Lu, Fei Sun
DOI: 10.1109/AICAS.2019.8771606
Abstract: The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015. The competition identifies the best technologies that can detect objects in images efficiently (short execution time and low energy consumption). This paper summarizes LPIRC 2018 by describing the winners' solutions. The paper also discusses the future of low-power computer vision.
Citations: 1
NeuroPilot: A Cross-Platform Framework for Edge-AI
Tung-Chien Chen, Wei-Ting Wang, Kloze Kao, Chia-Lin Yu, C. Lin, Shu-Hsin Chang, Pei-Kuei Tsung
DOI: 10.1109/AICAS.2019.8771536
Abstract: Artificial intelligence (AI) has spread from cloud servers to edge devices because of its rapid response, privacy, robustness, and efficient use of network bandwidth. However, deploying computation- and memory-bandwidth-intensive AI on edge devices is challenging because power and hardware resources are limited. The varied needs of applications, the diversity of devices, and fragmented supporting tools make integration difficult. In this paper, NeuroPilot, a cross-platform framework for edge AI, is introduced. Technologies at the software, hardware, and integration levels are proposed to achieve high performance while preserving flexibility. The NeuroPilot solution provides superior edge-AI capability for a wide range of applications.
Citations: 8
Automatic HCC Detection Using Convolutional Network with Multi-Magnification Input Images
Wei-Che Huang, P. Chung, H. Tsai, N. Chow, Y. Juang, H. Tsai, Shih-Hsuan Lin, Cheng-Hsiung Wang
DOI: 10.1109/AICAS.2019.8771535
Abstract: Postoperative pathologic examination of stained liver tissue is an important step in identifying prognostic factors for follow-up care of liver cancer. Traditionally, liver cancer detection is performed by pathologists observing the entire biological tissue, resulting in a heavy workload and potential misjudgment. Accordingly, automatic pathological examination has long been an active area of study. Most existing cancer-detection approaches, however, only extract cell-level information from single-scale high-magnification patches. In liver tissue, common cell-change phenomena such as apoptosis, necrosis, and steatosis appear similar in tumor and benign tissue. Hence, detection may fail when a patch covers only the changed-cell area, which cannot provide enough information about the neighboring cell structure. To overcome this problem, a convolutional network architecture with multi-magnification input can provide not only cell-level information, by referencing high-magnification patches, but also cell-structure information, by referencing low-magnification patches. The detection algorithm consists of two main stages: 1) extraction of cell-level and cell-structure-level feature maps from high-magnification and low-magnification images, respectively, by separate convolutional networks, and 2) integration of the multi-magnification features by a fully connected network. In this paper, VGG16 and Inception V4 were applied as the base convolutional networks for the liver-tumor detection task. The experimental results showed that the VGG16-based multi-magnification-input convolutional network achieved 91% mIoU on the HCC tumor-detection task. In addition, a comparison between single-scale CNN (SSCN) and multi-scale CNN (MSCN) approaches demonstrated that multi-scale patches provide better performance on the HCC classification task.
Citations: 13
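The two-stage structure described in the abstract (separate feature extractors per magnification, fused by a fully connected network) can be outlined schematically. This is a toy numpy sketch with hypothetical dimensions and a random projection standing in for the CNN backbone; the paper uses VGG16/Inception V4:

```python
import numpy as np

rng = np.random.default_rng(1)

def backbone(patch, out_dim=128):
    """Stand-in for a CNN feature extractor (e.g. VGG16): here
    just a random projection of the flattened patch + ReLU."""
    flat = patch.reshape(-1)
    W = rng.standard_normal((out_dim, flat.size)) * 0.01
    return np.maximum(W @ flat, 0.0)

def classify(high_mag_patch, low_mag_patch):
    # 1) per-magnification feature maps from separate networks
    f_cell = backbone(high_mag_patch)     # cell-level detail
    f_struct = backbone(low_mag_patch)    # cell-structure context
    # 2) fuse by concatenation + fully connected layer
    fused = np.concatenate([f_cell, f_struct])
    W_fc = rng.standard_normal((2, fused.size)) * 0.01
    return W_fc @ fused                   # [benign, tumor] scores

logits = classify(rng.standard_normal((64, 64)),
                  rng.standard_normal((64, 64)))
```

The key design point is that the low-magnification branch sees a wider field of view at coarser resolution, so the fused feature vector carries both the changed cells and their neighborhood structure.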