Accelerating machine learning at the edge with approximate computing on FPGAs

Journal: Tecnologia en Marcha (impact factor 0.1, JCR Q4, Multidisciplinary Sciences)

Publication date: 2022-11-28 · DOI: 10.18845/tm.v35i9.6491

Luis Gerardo León-Vega, Eduardo Salazar-Villalobos, Jorge Castro-Godínez

Citations: 0

Abstract

Accelerating machine learning at the edge with approximate computing on FPGAs

Running inference for complex machine learning (ML) algorithms at the edge is becoming important for decoupling system functionality from the cloud. However, ML models are growing in complexity faster than the available hardware resources. This research aims to accelerate machine learning by offloading computation to low-end FPGAs and using approximate computing techniques to optimise resource usage, taking advantage of the inherently inexact nature of machine learning models. In this paper, we propose a generic matrix multiply-add processing element design, parameterised in data type, matrix size, and data width. We evaluate resource consumption and error behaviour while varying the matrix size and the data width for a fixed-point data type. We find that the error grows with the matrix size but can be compensated by increasing the data width, which poses a trade-off between data width and matrix size with respect to the error.
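
The trade-off between data width and matrix size can be illustrated with a small software model. The sketch below is not the authors' processing element: it assumes inputs in [-1, 1), a signed fixed-point format with `width - 1` fractional bits, products truncated back to that format, and an accumulator wide enough never to overflow. All function names are hypothetical.

```python
import random


def to_fixed(x, frac_bits):
    """Quantise a real value to a signed integer with frac_bits fractional bits."""
    scale = 1 << frac_bits
    q = round(x * scale)
    # Saturate to the range of a (frac_bits + 1)-bit signed word.
    return max(-scale, min(scale - 1, q))


def matmul_add_fixed(a, b, c, frac_bits):
    """Compute A @ B + C on integers, truncating each product to frac_bits."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = c[i][j]  # assumption: accumulator never overflows
            for k in range(n):
                # The right shift floors the product, dropping the low bits.
                acc += (a[i][k] * b[k][j]) >> frac_bits
            out[i][j] = acc
    return out


def random_matrix(rng, n):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def mean_abs_error(n, width, trials=50, seed=0):
    """Average |fixed-point result - float result| over random n x n inputs."""
    rng = random.Random(seed)
    frac_bits = width - 1
    scale = 1 << frac_bits
    total = 0.0
    for _ in range(trials):
        a, b, c = (random_matrix(rng, n) for _ in range(3))
        qa, qb, qc = (
            [[to_fixed(v, frac_bits) for v in row] for row in m]
            for m in (a, b, c)
        )
        fixed = matmul_add_fixed(qa, qb, qc, frac_bits)
        for i in range(n):
            for j in range(n):
                ref = sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j]
                total += abs(fixed[i][j] / scale - ref)
    return total / (trials * n * n)


if __name__ == "__main__":
    for width in (8, 12, 16):
        for n in (2, 4, 8):
            print(f"width={width:2d} n={n} error={mean_abs_error(n, width):.6f}")
```

Under these assumptions, each truncated product contributes a small downward bias, so the error of one output element accumulates over the `n` terms of its dot product, while every extra bit of data width halves the size of each contribution. A real design would also have to account for accumulator overflow and the chosen rounding mode, which this model leaves out.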

Source journal

Tecnologia en Marcha (Multidisciplinary Sciences)

Self-citation rate

0.00%

Article count

Review time

28 weeks