SLIT: An Energy-Efficient Reconfigurable Hardware Architecture for Deep Convolutional Neural Networks

IF 0.6 · Zone 4 (Engineering & Technology) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC

IEICE Transactions on Electronics · Pub Date: 2020-01-01 · DOI: 10.1587/transele.2020cdp0002

Thi Diem Tran, Y. Nakashima


Citations: 2


SLIT: An Energy-Efficient Reconfigurable Hardware Architecture for Deep Convolutional Neural Networks

Convolutional neural networks (CNNs) dominate a range of applications, from advanced manufacturing to autonomous cars. Because energy cost matters, low-power hardware for CNNs is an active research trend. Owing to the large input size, the first few convolutional layers generally account for most of the latency and hardware resources in a hardware design. To address these challenges, this paper proposes an architecture named SLIT that extracts feature maps and reconstructs the first few layers of a CNN. In this reconstruction approach, multiply-accumulate operations are eliminated in the first layers. We evaluate the new topology on image classification with the MNIST, CIFAR, SVHN, and ImageNet datasets. Latency and hardware resources of the inference step are evaluated on an XC7Z020-1CLG484C FPGA with LeNet-5 and VGG schemes. On the LeNet-5 scheme, our architecture reduces latency by 39% and hardware resources by 70% compared to previous works, with a power consumption of 0.456 W. Although the VGG models achieve a 10% reduction in hardware resources and latency, we hope our overall results will give new impetus to future studies aiming at further optimization of hardware design. Notably, the SLIT architecture merges efficiently with most popular CNNs at a slight cost in accuracy: 0.27% on MNIST, 0.5% to 1.5% on CIFAR, approximately 2.2% on ImageNet, and no change on SVHN.

Key words: primary visual cortex, image classification, convolutional neural network, hardware architecture, FPGA, feature extraction
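The abstract's point that the convolutional front end carries most of the inference cost can be checked with a rough multiply-accumulate (MAC) count. The sketch below is only an illustration: it assumes the textbook LeNet-5 layer shapes (32x32 grayscale input, 5x5 kernels, fully connected channels), which may differ from the configuration the authors implemented, and the helper names `conv_macs`, `fc_macs`, and `mac_shares` are hypothetical.

```python
# Rough MAC count per layer for a textbook LeNet-5.
# Assumption: these layer shapes are the classic ones, not taken from the paper.

def conv_macs(out_h, out_w, out_ch, kernel, in_ch):
    """MACs of a stride-1 convolution: one kernel*kernel*in_ch dot product per output value."""
    return out_h * out_w * out_ch * kernel * kernel * in_ch

def fc_macs(n_in, n_out):
    """MACs of a fully connected layer: one multiply per weight."""
    return n_in * n_out

def mac_shares(layers):
    """Map each layer name to (macs, fraction of total MACs)."""
    total = sum(layers.values())
    return {name: (macs, macs / total) for name, macs in layers.items()}

LENET5_MACS = {
    "conv1": conv_macs(28, 28, 6, 5, 1),    # 32x32x1 input -> 28x28x6 output
    "conv2": conv_macs(10, 10, 16, 5, 6),   # 14x14x6 input -> 10x10x16 output
    "fc1": fc_macs(16 * 5 * 5, 120),
    "fc2": fc_macs(120, 84),
    "fc3": fc_macs(84, 10),
}

if __name__ == "__main__":
    for name, (macs, share) in mac_shares(LENET5_MACS).items():
        print(f"{name:6s} {macs:8d} MACs  {share:6.1%}")
```

Under these assumed shapes the two convolutional layers account for roughly 86% of all MACs, which is the kind of imbalance that motivates replacing the first layers. MAC count is only a proxy: it ignores memory traffic, pooling, and activations, so it does not predict the latency or resource figures reported in the abstract.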


Source journal

IEICE Transactions on Electronics (Engineering & Technology: Engineering, Electronic & Electrical)

CiteScore

1.00

Self-citation rate

20.00%

Articles published

Review time

3-6 weeks

About the journal: Currently, the IEICE has ten sections nationwide. Each section operates under the leadership of a section chief, four section secretaries and about 20 section councilors. Sections host lecture meetings, seminars and industrial tours, and carry out other activities. Topics: Integrated Circuits, Semiconductor Materials and Devices, Quantum Electronics, Opto-Electronics, Superconductive Electronics, Electronic Displays, Microwave and Millimeter Wave Technologies, Vacuum and Beam Technologies, Recording and Memory Technologies, Electromagnetic Theory.