Review of neural network model acceleration techniques based on FPGA platforms
Journal: Neurocomputing (Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Impact Factor 5.5)
DOI: 10.1016/j.neucom.2024.128511
Publication date: 2024-08-31
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S0925231224012827
Citations: 0
Abstract
Neural network models, celebrated for their outstanding scalability and computational capabilities, have demonstrated remarkable performance across fields such as vision, language, and multimodality. The rapid advancement of neural networks, fueled by the continued development of Internet technology and the growing demand for intelligent edge devices, introduces new challenges, including large model parameter sizes and increased storage pressure. In this context, Field-Programmable Gate Arrays (FPGAs) have emerged as a preferred platform for accelerating neural network models, thanks to their strong performance, energy efficiency, flexibility, and scalability. Building FPGA-based neural network systems requires bridging significant differences in objectives, methods, and design spaces between model design and hardware design. This review article adopts a comprehensive analytical framework to explore multidimensional implementation strategies, encompassing optimizations at the algorithmic and hardware levels as well as compiler optimization techniques. It focuses on methods for collaborative optimization between algorithms and hardware, identifies challenges in the collaborative design process, and proposes corresponding implementation strategies and key steps. Across these technological dimensions, the article provides in-depth technical analysis and discussion, aiming to offer valuable insights for research on optimizing and accelerating neural network models in edge computing environments.
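To make the abstract's "algorithm-level optimization" concrete: one canonical example of such an optimization for FPGA deployment is post-training quantization, which shrinks model storage and lets multiply-accumulate units run on narrow integer datapaths. The sketch below is an illustrative, minimal symmetric int8 quantizer; the function names and values are hypothetical and not taken from the reviewed article.

```python
# Minimal sketch of symmetric per-tensor int8 quantization, a common
# algorithm-level optimization when deploying neural networks on FPGAs.
# Illustrative only; names/values are assumptions, not the article's method.

def quantize_int8(weights):
    """Map float weights to int8 codes with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [x * scale for x in q]

weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
recon = dequantize(q, scale)
print(q)      # int8 codes stored on-chip
print(scale)  # per-tensor scale factor
```

On hardware, only the int8 codes and the scale need to be stored; the dequantization step is typically folded into the accumulator path rather than materialized, which is part of the algorithm-hardware co-design the review discusses.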
Journal description:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Its essential topics are neurocomputing theory, practice, and applications.