Reconfigurable Hardware Implementation of CNN Architecture using Zerobypass Multiplier

K. Sakthi, D. Abishek, M. Arun Kumar
DOI: 10.1109/ICCCI56745.2023.10128256
Published in: 2023 International Conference on Computer Communication and Informatics (ICCCI)
Publication date: 2023-01-23
Citation count: 0

Abstract

The current state of the art in image recognition, segmentation, and localization algorithms has reached an extremely high level of accuracy thanks to the development of deep neural networks and their use in computer vision problems. Specifically, Convolutional Neural Networks (CNNs) have reached human-level performance in image classification and detection. For object classification and detection use cases, a CNN must run on a portable, low-cost, low-power device. However, real-time execution of CNN-based applications is hindered by these devices' limited computing resources and small on-board memory. In this dissertation, we introduce hardware-efficient algorithms for performing complex computations in CNNs at low cost and with minimal power consumption. Additionally, we provide efficient VLSI architectures based on systolic arrays for building high-performance devices that run CNNs. The limitations of traditional CNNs in detecting occluded objects are also outlined in this thesis. We propose an improved CNN with self-feedback layers and present an algorithm to increase the detection accuracy for hidden objects. When the enhanced CNN is evaluated against the gold standard dataset, it detects hidden objects more accurately. However, the enhanced CNN requires far more computations to execute than a regular CNN. We therefore introduce a low-cost, low-power VLSI architecture design for efficient hardware execution of the improved CNN.
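The abstract does not describe the internals of the zero-bypass multiplier named in the title. As a rough software illustration only, the sketch below assumes the common meaning of the term: a multiply-accumulate unit skips the multiplication whenever either operand is zero, which saves work on sparse activations and weights. The function name `zero_bypass_mac` and the returned multiplication count are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def zero_bypass_mac(activations: Sequence[int],
                    weights: Sequence[int]) -> Tuple[int, int]:
    """Dot product that skips any term with a zero operand.

    Returns (accumulated sum, number of multiplications actually performed).
    This is an assumed behavioural model, not the paper's hardware design.
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")

    acc = 0
    multiplies = 0
    for a, w in zip(activations, weights):
        if a == 0 or w == 0:
            # Bypass: the product is zero, so the multiplier is not used.
            continue
        acc += a * w
        multiplies += 1
    return acc, multiplies


if __name__ == "__main__":
    # Sparse inputs, as is typical after a ReLU layer or weight pruning.
    total, used = zero_bypass_mac([0, 3, 0, 5], [2, 0, 7, 4])
    print(total, used)
```

In this model the result is identical to an ordinary dot product; only the count of multiplications changes, which is the quantity a hardware zero-bypass scheme would aim to reduce.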