2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS): Latest Papers

Performance Assessment of an Extremely Energy-Efficient Binary Neural Network Using Adiabatic Superconductor Devices
O. Chen, Z. Li, Tomoharu Yamauchi, Yanzhi Wang, N. Yoshikawa
DOI: https://doi.org/10.1109/AICAS57966.2023.10168607
Published: 2023-06-11
Abstract: Binary Neural Networks (BNNs) are gaining popularity for solving real-world problems using Deep Neural Networks (DNNs), such as image recognition and natural language processing. BNNs use binary precision for weights and activations, reducing memory usage by a factor of 32 compared to conventional networks using 32-bit floating-point precision. Among various types of BNNs, AQFP-based BNNs utilizing superconducting logic families are promising for energy-efficient computing, using magnetic flux quantization and quantum interference in Josephson-junction-based superconductor loops. This paper presents a performance assessment of a novel AQFP-based BNN architecture, highlighting scalability issues caused by increased inductance in the analog accumulation circuit. We also discuss potential optimization approaches to address these issues and improve scalability.
Citations: 0
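The 32-fold memory saving quoted in the abstract above follows from storing one bit per weight instead of 32. The sketch below is a generic software illustration of that idea and of the XNOR-popcount dot product commonly used with binarized values; it is not the AQFP analog accumulation circuit the paper assesses, and all function names are hypothetical.

```python
def binarize(values):
    """Map real values to +1 or -1 by sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a list of +1/-1 values into an int; bit i is 1 when signs[i] is +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Positions where the bits agree contribute +1, the rest contribute -1,
    so the result is 2 * (number of agreeing bits) - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_word ^ b_word) & mask
    matches = bin(agree).count("1")
    return 2 * matches - n


def memory_ratio(float_bits=32, binary_bits=1):
    """Storage ratio between a float weight and a binary weight."""
    return float_bits // binary_bits


# Illustrative values only (not from the paper).
weights = binarize([0.5, -1.2, 0.3, -0.1])
activations = binarize([1.0, -0.5, -2.0, 0.7])
dot = xnor_popcount_dot(pack_bits(weights), pack_bits(activations), len(weights))
print(dot, memory_ratio())
```

The packed result matches the ordinary sum of elementwise products of the two sign vectors, which is the property that lets binarized layers replace multiply-accumulate with bitwise logic.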
Bringing Touch to the Edge: A Neuromorphic Processing Approach For Event-Based Tactile Systems
Harshil Patel, Anup Vanarse, Kristofor D. Carlson, A. Osseiran
DOI: https://doi.org/10.1109/AICAS57966.2023.10168592
Published: 2023-06-11
Abstract: The rise of neuromorphic applications has highlighted the remarkable potential of biologically-inspired systems. Despite significant advancements in audio and visual technologies, research directed towards tactile sensing has not been as extensive. We propose a neuromorphic tactile system for sensing and processing that presents promising results for edge devices and applications. In this study, a neuromorphic tactile sensor, two data encoding techniques, and a two-layer spiking neural network (SNN) deployed on the AKD1000 Akida Neuromorphic System on Chip (NSoC) were used to demonstrate the system's capabilities. Results from experiments on the ST-MNIST dataset showed high accuracy, with the complement-coded variant achieving 93.1%, outperforming previous state-of-the-art models for this dataset. Additionally, an exploratory study showed that early classification was possible, with most samples requiring only 38% of the available events to classify correctly, reducing the amount of data that needs to be processed. The low power consumption and high throughput of both SNN models, with an average dynamic power consumption of 6.37 mW and 7.76 mW and an average throughput of 586 and 589 frames per second respectively, make the proposed system suitable for edge devices with limited power and processing resources. Overall, the proposed tactile sensing system presents a promising solution for edge applications that require high accuracy, low power consumption, and high throughput.
Citations: 0
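The power and throughput figures in the abstract above can be combined into an approximate dynamic energy per frame (power divided by frame rate). The short sketch below does that arithmetic; the pairing of the two power values with the two throughput values follows the "respectively" in the abstract, and the labels `first_model` and `second_model` are placeholders because the abstract does not say which encoding each figure belongs to.

```python
def energy_per_frame_uj(power_mw, frames_per_second):
    """Average dynamic energy per frame in microjoules.

    mW divided by frames/s gives mJ per frame; multiply by 1000 for uJ.
    """
    return power_mw / frames_per_second * 1000.0


# (dynamic power in mW, throughput in frames per second), as quoted in the abstract.
models = {
    "first_model": (6.37, 586),
    "second_model": (7.76, 589),
}

for name, (power_mw, fps) in models.items():
    print(f"{name}: {energy_per_frame_uj(power_mw, fps):.2f} uJ per frame")
```

This covers dynamic power only; static power and sensor energy are not part of the quoted figures, so the true energy per frame would be higher.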
PN-TMS: Pruned Node-fusion Tree-based Multicast Scheme for Efficient Neuromorphic Systems
Ziyang Shen, Chaoming Fang, Fengshi Tian, Jie Yang, M. Sawan
DOI: https://doi.org/10.1109/AICAS57966.2023.10168590
Published: 2023-06-11
Abstract: A growing demand for low-power and real-time computation is motivating the development of dedicated neuromorphic processors. To maximize scalability and power efficiency, multicore architecture has been broadly applied in existing neuromorphic processors. Nevertheless, mapping a Spiking Neural Network (SNN) on a multicore architecture requires a lot of multicast operations. Conventional routing algorithms like path-based routing and dimension order routing (DOR) lead to a severe overhead in both latency and power. To address these limitations, we propose a novel routing algorithm named Pruned Node-fusion Tree-based Multicast Scheme (PN-TMS). PN-TMS leverages multiple algorithms for route planning, optimizing latency and power simultaneously. Experimental results show that PN-TMS outperforms existing network processors' routing schemes in terms of both energy consumption and latency, achieving an average energy-delay product (EDP) reduction of 38.9%.
Citations: 0
Context Swap: Multi-PIM System Preventing Remote Memory Access for Large Embedding Model Acceleration
Hong Kal, Cheolhwan Kim, Minjae Kim, W. Ro
DOI: https://doi.org/10.1109/AICAS57966.2023.10168595
Published: 2023-06-11
Abstract: Processing-in-Memory (PIM) has been an attractive solution to accelerate memory-intensive neural network layers. In particular, PIM is efficient for layers using embeddings, such as the embedding layer and graph convolution layer, because of their large capacity and low arithmetic intensity. The embedding tables of such layers are stored across multiple memory nodes and processed by local PIM modules with sparse access patterns. To compute data from other memory nodes on a local PIM module, a naive approach is to allow the local PIM to retrieve data from remote memory nodes. This approach might incur significant performance degradation due to the long latency overhead of remote accesses. To avoid remote access, a PIM system can adopt a framework based on the MapReduce programming model, which enables PIMs to compute the local data only and CPUs to compute intermediate results of PIMs. However, the multi-PIM system still suffers from performance degradation because the framework is processed on the CPU and it has a long delay compared to the PIM kernel execution. Therefore, we propose a context swap technique that prevents remote data access even without a high-latency framework. We observe that transferring PIM contexts to the remote PIM node requires much less data traffic than remote accesses of data. Our PIM system makes PIM nodes swap their context data with each other when they complete their own computation and no longer have local data to compute. Until all PIMs calculate all local data, several context swaps occur. The context swap is performed by a memory controller between PIMs in the same CPU socket and simple software between PIMs in different CPU sockets. As a result, the proposed multi-PIM system outperforms the base PIM system transferring remote data and the PIM system with the kernel-managing framework by 4.1× and 3.3×, respectively.
Citations: 0
A Demonstration Platform for Large-Scaled Point Cloud Network Based on 28nm 2D/3D Unified Sparse Convolution Accelerator
Xiaoyu Feng, Wenyu Sun, Shupei Fan, Chen Tang, Yixiong Yang, Jinshan Yue, Q. Liao, Huazhong Yang, Yongpan Liu
DOI: https://doi.org/10.1109/AICAS57966.2023.10168558
Published: 2023-06-11
Abstract: 3D point cloud processing plays an important role in many emerging applications such as autonomous driving, visual navigation, and virtual reality. It calls for hardware acceleration of multiple key operations, including 3D Submanifold SCONV, 3D non-Submanifold SCONV, and 2D SCONV. This work presents a 2D/3D unified sparse convolution accelerator for large-scale voxel-based point cloud networks. The chip is fabricated in TSMC 28nm CMOS technology and achieves 3.3-16.9 FPS running at 60-400 MHz when computing the SECOND network on the KITTI dataset. This work was included in ISSCC 2023 [1]. A demonstration is given to show the real-time 3D processing with a lidar sensor.
Citations: 0
FEEP: Functional ECO Synthesis with Efficient Patch Minimization
Yaotian Liu, Yuhang Zhang, Qing Zhang, Rui Chen, Yongfu Li
DOI: https://doi.org/10.1109/AICAS57966.2023.10168557
Published: 2023-06-11
Abstract: Functional engineering change order (ECO) has been an essential process in modern complex integrated circuit design. Finding a high-quality circuit patch efficiently has long been a challenge. This paper proposes FEEP, an automatic and efficient synthesis-based functional ECO method. Structural pruning and stratified searching techniques are proposed to minimize search space without extra logical equivalence checks. Moreover, we propose a machine-learning-based two-stage patch size predictor that assists in predicting patch quality. Experimental results show that our algorithm can efficiently search and produce high-quality patches under various test cases.
Citations: 0
Efficiency Comparison of Machine Learning Algorithms for EEG Interpretation
Xia Han, F. Amiel, Xun Zhang, Kunni Wei, Cong Yan, Wenjun Hu, Zefeng Wang
DOI: https://doi.org/10.1109/AICAS57966.2023.10168626
Published: 2023-06-11
Abstract: This paper intends to use a small protocol to detect stroke in a patient using signals provided by only three EEG probes. To achieve this objective, we compare the performance, in terms of accuracy and time, of six machine learning (ML) algorithms (Random Forest, Logistic Regression, Support Vector Machine, K-Nearest Neighbor, Decision Tree and CatBoost) in an EEG-based pathology classification process. We use a database of EEG signals recorded with three electrodes, established by Beijing University of Chinese Medicine, from subjects who are either healthy or affected by stroke while they view planes of five different colors. The health status of each subject is known. Each algorithm is trained on the records of 70% of the population, and its performance is estimated on the remaining 30%. The process is then repeated one hundred times, each time changing the set used for training and the set used for testing. We then compare the methods using statistics of the results obtained with each. Our results show that the SVM algorithm is the most efficient in terms of the accuracy of the results, and can detect stroke with a reliability of 70%.
Citations: 0
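The evaluation protocol described in the abstract above (train on 70%, test on 30%, repeat one hundred times with different splits, then summarize) can be sketched as a repeated hold-out loop. Everything in the sketch below is an assumption made for illustration: the data are synthetic three-feature samples rather than EEG recordings, the nearest-centroid classifier is a stand-in for the six algorithms the paper compares, and the split is done over samples, whereas the paper splits by subject population.

```python
import random
import statistics


def make_synthetic_data(n_per_class, rng):
    """Two classes of 3-feature samples drawn around different centers."""
    data = []
    for label, center in ((0, 0.0), (1, 1.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(center, 1.0) for _ in range(3)]
            data.append((features, label))
    return data


def fit_centroids(train):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label of the closest class centroid."""
    def sq_dist(center):
        return sum((x - c) ** 2 for x, c in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def repeated_holdout(data, n_repeats=100, train_fraction=0.7, seed=0):
    """Shuffle, split, train, and test n_repeats times; return the accuracies."""
    rng = random.Random(seed)
    n_train = round(len(data) * train_fraction)
    accuracies = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, f) == y for f, y in test)
        accuracies.append(correct / len(test))
    return accuracies


data = make_synthetic_data(50, random.Random(42))
accs = repeated_holdout(data)
print(f"mean accuracy {statistics.mean(accs):.3f}, stdev {statistics.stdev(accs):.3f}")
```

Reporting the mean and spread over the hundred runs, rather than a single split, is what makes the comparison between algorithms less sensitive to one lucky or unlucky partition.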
Image Recovery Through Scattering Media via GAN Reconstruction and SNES Optimization
Pengfei Qi, Yuanjin Zheng
DOI: https://doi.org/10.1109/AICAS57966.2023.10168553
Published: 2023-06-11
Abstract: Optical image recovery through scattering media is a significant yet challenging problem. Iterative wavefront shaping is one of the powerful tools to re-distribute the diffusive light and compensate for the diffuser by controlling the incident wavefront. However, in the scenario that only a feedback signal on the camera can be obtained, this technology would fail due to the lack of target images. In this paper, we propose a new scheme for recovering images through scattering media in the absence of target images. In particular, we employ an improved Generative Adversarial Network (GAN) for computational reconstruction and separable natural evolution strategy (SNES) for wavefront shaping optimization. Both simulation and experimental results suggest that the proposed scheme will open up new opportunities in the applications of biomedical imaging, optical encryption, holographic display, etc.
Citations: 0
KP2Dtiny: Quantized Neural Keypoint Detection and Description on the Edge
Thomas Rüegg, Marco Giordano, Michele Magno
DOI: https://doi.org/10.1109/AICAS57966.2023.10168598
Published: 2023-06-11
Abstract: Detection and description of keypoints in images is a fundamental component of a wide range of tasks such as Simultaneous Localization And Mapping (SLAM), image alignment and structure from motion (SfM). Efficient computation of these features is crucial for real-time applications and has been addressed by multiple handcrafted algorithms and, recently, by deep neural network-based detectors. Learned detectors achieve high detection performance, but pose high computational requirements, making them slow and impractical for low-power, resource-constrained platforms. This paper presents a quantized neural keypoint detector and descriptor optimized for edge devices, exploiting two recent AI platforms: the MAX78000 by Analog Devices and the Coral AI USB accelerator from Google. To accommodate the diverse constraints and requirements of various applications, we propose and evaluate two model architectures (KP2DtinySmall and KP2DtinyFast) and deploy them on the aforementioned platforms using full 8-bit integer quantization. Furthermore, we extensively evaluate these models in terms of power, latency and accuracy, reporting results on three image sizes (88x88, 320x240 and 640x480), evaluating both quantized and non-quantized models. Fully quantized, KP2DtinySmall reduces network size by a factor of 54x while improving homography estimation accuracy on 88x88 images on the most stringent threshold (Correctness d1) by 32.4% (0.550) and on 320x240 images by 10.7% (0.648) compared to the KeypointNet architecture by Yang You et al. This result is achieved by designing a new network with low-power platforms in mind, particularly addressing the lower resolution by increasing the density of detectable features. Deployed on the MAX78000 MCU, inference of low-resolution images is run at 59 FPS, consuming 1.1 mJ per image. On the Coral USB accelerator, KP2DtinyFast runs inference on low-resolution images at 527 FPS consuming 3.1 mJ; at high resolution it achieves 70 FPS at 19.9 mJ per inference.
Citations: 0
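The abstract above mentions full 8-bit integer quantization without describing the scheme. As a rough illustration of what mapping floating-point values to 8-bit integers involves, the sketch below uses symmetric per-tensor quantization; this particular scheme and the function names are assumptions, and the toolchains for the MAX78000 and the Coral accelerator may quantize differently.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to the integer range [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


# Illustrative weights only (not from the paper).
weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"scale={scale:.6f}", f"worst error={worst_error:.6f}")
```

With this scheme each value occupies 8 bits instead of 32, and the rounding error per value is bounded by half the scale. The 54x size reduction reported in the abstract is larger than 4x, so it evidently also reflects the smaller architecture and not quantization alone.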
A Low-Power Hardware Accelerator of MFCC Extraction for Keyword Spotting in 22nm FDSOI
Liyuan Guo, M. Jobst, J. Partzsch, Stefan Scholze, Andreas Dixius, Matthias Lohrmann, S. Zeinolabedin, C. Mayr
DOI: https://doi.org/10.1109/AICAS57966.2023.10168587
Published: 2023-06-11
Abstract: With the development of artificial intelligence, the real-time feature extraction of acoustic signals is required in a wide variety of applications, such as keyword spotting and speech recognition. Feature extraction based on Mel-frequency cepstral coefficients (MFCCs) is one of the most significant methods for this purpose. A software implementation of the MFCC extraction results in relatively high power consumption and long computation time, often making it unsuitable for tiny battery-powered devices. Therefore, an on-chip accelerator of MFCC extraction is of interest in cutting-edge scenarios. This paper presents a fixed-point low-power hardware accelerator of MFCC feature extraction implemented in 22nm FDSOI technology. It consumes an average power of 2.78 µW for a 1024-sample frame at a clock frequency of 1 MHz. For keyword spotting, the quantized accelerator achieves an average accuracy of around 96% working along with different classification networks.
Citations: 0
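MFCC extraction, the subject of the abstract above, relies on a mel-spaced filterbank. The sketch below shows one widely used form of the Hz-to-mel conversion and how it yields filter center frequencies; the abstract does not state which mel formula, filter count, or frequency range the accelerator uses, so the formula choice and the example parameters (10 filters, 0 to 8000 Hz) are assumptions.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_filters filters equally spaced on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Hypothetical configuration for illustration.
centers = mel_center_frequencies(10, 0.0, 8000.0)
print([round(c, 1) for c in centers])
```

Equal spacing in mels places the filters densely at low frequencies and sparsely at high ones, which is the perceptual weighting that the later log and DCT stages of MFCC extraction build on.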