Latest papers from the 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)

TRIO: a Novel 10T Ternary SRAM Cell for Area-Efficient In-memory Computing of Ternary Neural Networks
Authors: Thanh-Dat Nguyen, Minh-Son Le, Thi-Nhan Pham, I. Chang
DOI: 10.1109/AICAS57966.2023.10168596
Published: 2023-06-11
Abstract: We introduce TRIO, a 10T SRAM cell for in-memory computing circuits in ternary neural networks (TNNs). TRIO's thin-cell-type layout occupies only 0.492 μm² in a 28 nm FD-SOI technology, which is smaller than some state-of-the-art ternary SRAM cells. Comparing TRIO to other works, we found that it consumes less analog multiplication power, indicating its potential for improving the area and power efficiency of TNN IMC circuits. Our optimized TNN IMC circuit using TRIO achieved high area and power efficiencies of 369.39 TOPS/mm² and 333.8 TOPS/W in simulations.
Citations: 0
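The multiply-accumulate that a ternary IMC column evaluates in the analog domain can be mimicked in software. Below is a minimal sketch, assuming weights restricted to {-1, 0, +1}; the function name and toy values are illustrative, not taken from the paper.

```python
def ternary_dot(activations, weights):
    """Multiplication-free dot product with weights in {-1, 0, +1}.

    A +1 weight adds the activation, a -1 weight subtracts it, and a 0
    weight contributes nothing; this mirrors what a ternary SRAM column
    computes by summing or withholding cell currents.
    """
    acc = 0
    for a, w in zip(activations, weights):
        if w == 1:
            acc += a
        elif w == -1:
            acc -= a
    return acc

# Toy usage: one output over four inputs.
print(ternary_dot([3, 1, 4, 2], [1, 0, -1, 1]))  # 3 - 4 + 2 = 1
```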
Deep Learning Compiler Optimization on Multi-Chiplet Architecture
Authors: Huiqing Xu, Kuang Mao, Quihong Pan, Zhaorong Tang, Mengdi Wang, Ying Wang
DOI: 10.1109/AICAS57966.2023.10168656
Published: 2023-06-11
Abstract: Multi-chiplet architectures can provide a high-performance solution for new workloads such as deep learning models. To fully utilize the chiplets and accelerate the execution of deep learning models, we present a deep learning compilation optimization framework for chiplets and propose a scheduling method based on data dependence. Experiments show that our method improves compilation efficiency, and that the performance of the scheduling scheme is at least 1-2 times higher than that of traditional algorithms.
Citations: 0
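The abstract does not detail the scheduling method, so the following is only a rough illustration of dependence-aware list scheduling of an operator DAG onto chiplets; the graph representation, cost model, greedy priority, and chiplet count are all assumptions.

```python
def schedule(ops, deps, cost, n_chiplets=4):
    """Greedy list scheduling of an operator DAG onto chiplets.

    ops:  operator names in any order
    deps: op -> list of predecessor ops (data dependences)
    cost: op -> estimated latency
    Returns op -> (chiplet index, start time).
    """
    remaining = {op: set(deps.get(op, [])) for op in ops}
    finish = {}                     # op -> finish time
    free_at = [0.0] * n_chiplets    # earliest idle time of each chiplet
    placement = {}
    while remaining:
        # Ready ops: all data dependences have already been scheduled.
        ready = [op for op, preds in remaining.items() if preds.issubset(finish)]
        op = max(ready, key=lambda o: cost[o])          # longest op first (simple heuristic)
        del remaining[op]
        chiplet = min(range(n_chiplets), key=free_at.__getitem__)
        start = max([free_at[chiplet]] + [finish[p] for p in deps.get(op, [])])
        finish[op] = start + cost[op]
        free_at[chiplet] = finish[op]
        placement[op] = (chiplet, start)
    return placement

# Toy graph: conv1 -> (conv2, conv3) -> add, scheduled on two chiplets.
deps = {"conv2": ["conv1"], "conv3": ["conv1"], "add": ["conv2", "conv3"]}
cost = {"conv1": 4, "conv2": 3, "conv3": 5, "add": 1}
print(schedule(["conv1", "conv2", "conv3", "add"], deps, cost, n_chiplets=2))
```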
NeuroBMI: A New Neuromorphic Implantable Wireless Brain Machine Interface with A 0.48 µW Event-Driven Noise-Tolerant Spike Detector
Authors: Jinbo Chen, Hui Wu, Xing Liu, Razieh Eskandari, Fengshi Tian, Wenjun Zou, Chaoming Fang, Jie Yang, M. Sawan
DOI: 10.1109/AICAS57966.2023.10168619
Published: 2023-06-11
Abstract: Brain-Machine Interfaces (BMIs) have seen widespread application in neuroscience research and neural prosthetics. As the technology trend shifts from wearable to implantable wireless BMIs with increasing channel counts, the volume of data generated requires impractically high bandwidth and transmission power for the implants. In this paper, we present NeuroBMI, a novel neuromorphic implantable wireless BMI that leverages a unified neuromorphic strategy for neural signal sampling, processing, and transmission. The proposed NeuroBMI and neuromorphic strategy reduce the transmitted data rate and overall power consumption. NeuroBMI takes advantage of the high sparsity of neural signals by employing an integrate-and-fire sampling based analog-to-spike converter (ASC), which generates digital spike trains based on triggered events and avoids unnecessary data sampling. An event-driven noise-tolerant spike detector and an event-driven spike transmitter are also proposed to further reduce energy consumption and the transmitted data rate. Simulation results demonstrate that the proposed NeuroBMI achieves a data compression ratio of 520, with the proposed spike detector consuming only 0.48 µW.
Citations: 0
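A simple software model of integrate-and-fire sampling illustrates why quiet neural signals produce almost no data; the threshold, the reset rule, and the example interface are assumptions rather than the paper's circuit.

```python
def integrate_and_fire_sample(signal, threshold=1.0):
    """Model of an integrate-and-fire sampling analog-to-spike converter.

    The running integral of the input is compared against +/-threshold;
    each crossing emits a signed spike event and resets the integrator,
    so segments with little neural activity generate no samples at all.
    """
    acc = 0.0
    spikes = []                      # (sample index, polarity)
    for i, x in enumerate(signal):
        acc += x
        if acc >= threshold:
            spikes.append((i, +1))
            acc -= threshold
        elif acc <= -threshold:
            spikes.append((i, -1))
            acc += threshold
    return spikes

# A flat signal produces nothing; a short burst produces a single event.
print(integrate_and_fire_sample([0.0] * 20 + [0.6, 0.7, -0.9, 0.8, 0.5]))
```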
MF-DSNN: An Energy-efficient High-performance Multiplication-free Deep Spiking Neural Network Accelerator
Authors: Yue Zhang, Shuai Wang, Yi Kang
DOI: 10.1109/AICAS57966.2023.10168643
Published: 2023-06-11
Abstract: Inspired by the brain's structure, Spiking Neural Networks (SNNs) are computing models that communicate and calculate through spikes. Well-trained SNNs demonstrate high sparsity in both weights and activations, distributed spatially and temporally. This sparsity presents both opportunities and challenges for energy-efficient SNN inference compared to conventional artificial neural networks (ANNs): the high sparsity can significantly reduce inference delay and energy consumption, but the temporal dimension greatly complicates the design of spiking accelerators. In this paper, we propose a unique solution for sparse spiking neural network acceleration. First, we adopt a temporal coding scheme called FS coding, which differs from the rate coding used in traditional SNNs; due to the nature of FS coding, our design eliminates the need for multiplication. Second, we parallelize the computation required by a neuron at each time point to minimize accesses to the weight data. Third, we fuse multiple spikes into one new spike to reduce inference delay and energy consumption. The proposed architecture exhibits better performance and energy efficiency at lower cost. Our experiments show that, running MobileNet-V2 on the ImageNet dataset, MF-DSNN achieves 6× to 22× energy efficiency improvements over state-of-the-art artificial neural network accelerators, with an accuracy degradation of less than 0.9% and less silicon area.
Citations: 0
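FS coding itself is only named in the abstract; one commonly described variant assigns each time step a power-of-two significance, so every spike contributes a shifted copy of the weight and the dot product becomes shift-and-add. The sketch below follows that assumption and is not the accelerator's actual datapath.

```python
def fs_encode(value, timesteps=4):
    """Encode a non-negative integer activation as at most `timesteps` spikes.

    Time step t is assumed to carry significance 2**(timesteps-1-t), so the
    spike pattern is simply the binary expansion of the clipped activation.
    """
    value = min(value, 2 ** timesteps - 1)
    return [(value >> (timesteps - 1 - t)) & 1 for t in range(timesteps)]

def fs_dot(activations, weights, timesteps=4):
    """Multiplication-free dot product: each spike adds a shifted weight."""
    acc = 0
    for a, w in zip(activations, weights):
        for t, spike in enumerate(fs_encode(a, timesteps)):
            if spike:
                acc += w << (timesteps - 1 - t)   # shift-and-add instead of multiply
    return acc

# Check against the ordinary dot product on small integers.
acts, ws = [3, 0, 7, 2], [1, -2, 3, 4]
assert fs_dot(acts, ws) == sum(a * w for a, w in zip(acts, ws))
```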
A Hardware-Centric Approach to Increase and Prune Regular Activation Sparsity in CNNs
Authors: Tim Hotfilter, Julian Höfer, Fabian Kreß, F. Kempf, Leonhard Kraft, T. Harbaum, J. Becker
DOI: 10.1109/AICAS57966.2023.10168566
Published: 2023-06-11
Abstract: A key challenge in computing convolutional neural networks (CNNs), besides the vast number of computations, is the associated large number of energy-intensive transactions from main to local memory. In this paper, we present a methodical approach to maximize and prune coarse-grained, regular, blockwise sparsity in activation feature maps during CNN inference on dedicated dataflow architectures. Regular sparsity that fits the target accelerator, e.g., a systolic array or vector processor, allows simpler and less resource-expensive pruning than irregular sparsity, saving memory transactions and computations. Our threshold-based technique maximizes the number of regular sparse blocks in each layer. The wide range of threshold combinations that results from the close correlation between the number of sparse blocks and network accuracy can be explored automatically by our exploration tool, Spex. To harness the found sparse blocks for reducing memory transactions and MAC operations, we also propose Sparse-Blox, a low-overhead hardware extension for common neural network hardware accelerators. Sparse-Blox adds up to 5× less area than state-of-the-art accelerator extensions that operate on irregular sparsity. Evaluation of our blockwise pruning method with Spex on ResNet-50 and Yolo-v5s shows a reduction of up to 18.9% and 12.6% in memory transfers, and 802 M (19.0%) and 1.5 G (24.3%) MAC operations, with a 1% or 1 mAP accuracy drop, respectively.
Citations: 0
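The threshold-based blockwise pruning idea can be sketched directly on an activation tensor; the block shape, the mean-absolute-value metric, and the threshold value below are assumptions, not the settings explored with Spex.

```python
import numpy as np

def prune_blocks(fmap, block=4, threshold=0.1):
    """Zero out spatial blocks whose mean absolute activation is small.

    fmap is an activation tensor of shape (channels, height, width), with
    height and width assumed to be multiples of `block`. Regular (blockwise)
    zeros let a dataflow accelerator skip whole tiles of MACs and the
    corresponding memory transfers.
    """
    out = fmap.copy()
    _, h, w = out.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = out[:, y:y + block, x:x + block]
            if np.abs(tile).mean() < threshold:
                tile[...] = 0.0     # the slice is a view, so this writes into `out`
    return out

# Toy usage: a mostly quiet feature map loses its near-zero tiles.
fm = np.random.rand(8, 16, 16) * 0.05
fm[:, :4, :4] += 1.0                        # one clearly active block survives
print(np.count_nonzero(prune_blocks(fm)))   # 8 * 4 * 4 = 128 non-zeros remain
```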
Simulation-driven Latency Estimations for Multi-core Machine Learning Accelerators
Authors: Yannick Braatz, D. Rieber, T. Soliman, O. Bringmann
DOI: 10.1109/AICAS57966.2023.10168589
Published: 2023-06-11
Abstract: Underutilization of compute resources decreases the performance of single-core machine learning (ML) accelerators. Multi-core accelerators therefore divide the computational load among multiple smaller groups of processing elements (PEs), keeping more resources active in parallel. However, while this produces higher throughput, the accelerator's behavior becomes more complex: supplying multiple cores with data demands adjustments to the on-chip memory hierarchy and to direct memory access controller (DMAC) programming. Correctly estimating these effects becomes crucial for optimizing multi-core accelerators, especially during design space exploration (DSE). This work introduces a novel semi-simulated prediction methodology for latency estimation in multi-core ML accelerators. Simulating only the dynamic system interactions while determining the latency of isolated accelerator elements analytically makes the proposed methodology both precise and fast. We evaluate our methodology on an in-house configurable accelerator with various computational cores on two widely used convolutional neural networks (CNNs), and we estimate the accelerator latency with an average error of 4.7%.
Citations: 0
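As a rough picture of the semi-simulated methodology, the sketch below takes per-tile compute latencies from an analytic model and resolves contention on a single shared DMA channel with a tiny event-driven loop; the tile format, the single-channel assumption, and the bandwidth units are illustrative, not the paper's.

```python
def estimate_latency(tiles, dma_bandwidth):
    """Semi-simulated latency estimate for a multi-core accelerator.

    Each tile is (core_id, bytes_in, compute_cycles). Compute time comes
    from an analytic model of the isolated core; contention on the single
    shared DMA channel is resolved by a tiny event-driven simulation.
    dma_bandwidth is in bytes per cycle, so everything is counted in cycles.
    """
    dma_free = 0.0        # cycle at which the shared DMA channel is next idle
    core_free = {}        # core_id -> cycle at which that core is next idle
    total = 0.0
    for core, nbytes, cycles in tiles:
        # The transfer waits for the shared channel and for the core's buffer.
        start_dma = max(dma_free, core_free.get(core, 0.0))
        end_dma = start_dma + nbytes / dma_bandwidth
        dma_free = end_dma
        # Analytic compute latency of the isolated core starts once its data arrive.
        core_free[core] = end_dma + cycles
        total = max(total, core_free[core])
    return total

# Toy usage: four tiles round-robined over two cores, 64 B/cycle DMA.
tiles = [(0, 4096, 300), (1, 4096, 300), (0, 2048, 150), (1, 2048, 150)]
print(estimate_latency(tiles, dma_bandwidth=64))
```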
A Memristor-Inspired Computation for Epileptiform Signals in Spheroids
Authors: Ivan Diez-de-los-Rios, J. Ephraim, G. Palazzolo, T. Serrano-Gotarredona, G. Panuccio, B. Linares-Barranco
DOI: 10.1109/AICAS57966.2023.10168611
Published: 2023-06-11
Abstract: In this paper we present a memristor-inspired computational method for obtaining a type of running "spectrogram" or "fingerprint" of epileptiform activity generated by rodent hippocampal spheroids. It can be used to compute, on the fly and at low computational cost, an alert-level signal for the onset of epileptiform events. We describe the computational method behind this "fingerprint" technique and illustrate it on epileptiform events recorded from hippocampal spheroids with a microelectrode array system.
Citations: 0
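The abstract does not spell out the memristor-inspired update rule; purely as an illustration of a low-cost running "fingerprint", the sketch below uses a bank of leaky integrators with different decay rates whose joint state drives an alert level. The decay constants, the squaring nonlinearity, and the alert threshold are all assumptions.

```python
def running_fingerprint(samples, decays=(0.999, 0.99, 0.9), alert_threshold=5.0):
    """Streaming 'fingerprint' of a field-potential trace.

    Each fingerprint channel is a leaky integrator of the squared input with
    its own decay rate, updated in O(1) per sample, so an alert-level signal
    for epileptiform onset can be derived on the fly.
    """
    state = [0.0] * len(decays)
    fingerprint, alerts = [], []
    for x in samples:
        for i, d in enumerate(decays):
            state[i] = d * state[i] + (1.0 - d) * x * x
        fingerprint.append(tuple(state))
        alerts.append(sum(state) > alert_threshold)
    return fingerprint, alerts
```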
Live Demonstration: SRAM Compute-In-Memory Based Visual & Aural Recognition System
Authors: Anjunyi Fan, Bo Hu, Zhonghua Jin, Haiyue Han, Yaojun Zhang, Yue Yang, Yuchao Yang, Bonan Yan, Ru Huang
DOI: 10.1109/AICAS57966.2023.10168569
Published: 2023-06-11
Abstract: We propose a live demonstration at AICAS'2023 of commercial SRAM compute-in-memory (CIM) accelerators. The demonstration includes both visual and aural signal processing and classification performed by SRAM-based CIM engines. The visual part is a low-power face recognition platform that can display and detect the audience's faces in real time. The aural part is a keyword spotting engine with which the audience can interact and control the device for designated tasks (such as "volume up" and "volume down"). The demonstration is interactive and gives a live sense of the energy-efficiency improvement achieved with the commercial CIM accelerators.
Citations: 0
A Hierarchically Reconfigurable SRAM-Based Compute-in-Memory Macro for Edge Computing
Authors: Runxi Wang, Xinfei Guo
DOI: 10.1109/AICAS57966.2023.10168564
Published: 2023-06-11
Abstract: AI running on the edge requires silicon that can meet demanding performance requirements while staying within aggressive power and area budgets. Frequently updated AI algorithms also demand processors matched to them well enough to exploit their advantages. Compute-in-memory (CIM) appears as a promising, energy-efficient architecture that performs the intensive computations in situ, where the data are stored. While prior works have made great progress in designing SRAM-based CIM macros with fixed functionality tailored to specific AI applications, the flexibility needed for wider usage scenarios is missing. In this paper, we propose a novel SRAM-based CIM macro that can be hierarchically configured to support various Boolean operations, arithmetic operations, and macro operations. In addition, we demonstrate with an example that the proposed design can be extended to support more essential edge computations with minimal overhead. Compared with existing reconfigurable SRAM-based CIM macros, this work achieves a better balance between reconfigurability and hardware cost by implementing flexibility at multiple design hierarchies.
Citations: 0
FPGA-Based High-Speed and Resource-Efficient 3D Reconstruction for Structured Light System
Authors: Feng Bao, Zehua Dong, Jie Yu, Songping Mai
DOI: 10.1109/AICAS57966.2023.10168616
Published: 2023-06-11
Abstract: To achieve high-speed, low-resource-consumption 3D measurement, we propose a parallel, fully pipelined FPGA architecture for the phase measuring profilometry algorithm. The proposed system uses four-step phase shifting and Gray-code decoding to generate accurate 3D point clouds. Experimental results show that the proposed architecture can process 12 frames of 720 × 540 images in just 12.2 ms, which is 110 times faster than the same implementation in software, and that it has the smallest resource consumption among comparable FPGA systems. This makes the proposed system well suited to high-speed embedded 3D shape measurement applications.
Citations: 0
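The per-pixel core of four-step phase measuring profilometry is standard: with fringe images captured at phase shifts of 0, pi/2, pi, and 3pi/2, the wrapped phase is atan2(I4 - I2, I1 - I3), and the Gray-code images supply the fringe order for unwrapping. The NumPy sketch below stands in for the FPGA pipeline; the array names and calling convention are assumptions.

```python
import numpy as np

def wrapped_phase(i1, i2, i3, i4):
    """Four-step phase shifting with shifts 0, pi/2, pi, 3pi/2.

    For I_k = A + B*cos(phi + k*pi/2): I4 - I2 = 2B*sin(phi) and
    I1 - I3 = 2B*cos(phi), so the wrapped phase is atan2(I4 - I2, I1 - I3),
    evaluated independently at every pixel.
    """
    return np.arctan2(i4 - i2, i1 - i3)

def absolute_phase(wrapped, fringe_order):
    """Unwrap using the fringe order k decoded from the Gray-code images."""
    return wrapped + 2.0 * np.pi * fringe_order
```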