Searching Tiny Neural Networks for Deployment on Embedded FPGA

2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS). Pub Date: 2023-06-11. DOI: 10.1109/AICAS57966.2023.10168571

Haiyan Qin, Yejun Zeng, Jinyu Bai, Wang Kang



Abstract



Embedded FPGAs have become increasingly popular as acceleration platforms for the deployment of edge-side artificial intelligence (AI) applications, due in part to their flexible and configurable heterogeneous architectures. However, the complex deployment process hinders the realization of AI democratization, particularly at the edge. In this paper, we propose a software-hardware co-design framework that enables simultaneous searching for neural network architectures and corresponding accelerator designs on embedded FPGAs. The proposed framework comprises a hardware-friendly neural architecture search space, a reconfigurable streaming-based accelerator architecture, and a model performance estimator. An evolutionary algorithm targeting multi-objective optimization is employed to identify the optimal neural architecture and corresponding accelerator design. We evaluate our framework on various datasets and demonstrate that, in a typical edge AI scenario, the searched network and accelerator can achieve up to a 2.9% accuracy improvement and up to a 21× speedup compared to manually designed networks based on common accelerator designs when deployed on a widely used embedded FPGA (Xilinx XC7Z020).
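The abstract does not give the details of the search, so the following is only a minimal sketch of what a multi-objective evolutionary co-search over network and accelerator parameters could look like. Everything in it is an assumption made for illustration: the `Candidate` fields, the value ranges, the toy `estimate` function standing in for a performance estimator, the `feasible` resource check, and the simple elitist Pareto selection (which is not claimed to be the authors' algorithm).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    depth: int        # hypothetical network knob: number of conv blocks
    width: int        # hypothetical network knob: channels per block
    parallelism: int  # hypothetical accelerator knob: parallel compute units


# Hypothetical search space, invented for this sketch.
DEPTHS = (2, 3, 4, 5, 6)
WIDTHS = (8, 16, 32, 64)
PARALLELISM = (1, 2, 4, 8, 16)


def estimate(c):
    """Toy estimator (made-up formulas): returns (accuracy_proxy, latency_proxy)."""
    macs = c.depth * c.width * c.width
    accuracy = 1.0 - 1.0 / (1.0 + 0.002 * macs)  # saturates as the model grows
    latency = macs / c.parallelism               # more parallelism, lower latency
    return accuracy, latency


def feasible(c, budget=256):
    """Made-up resource constraint standing in for an FPGA resource limit."""
    return c.width * c.parallelism <= budget


def dominates(a, b):
    """True if score a is at least as good as b on both objectives and differs."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_front(population):
    """Keep the candidates whose scores are not dominated by any other score."""
    scored = [(c, estimate(c)) for c in population]
    return [c for c, s in scored if not any(dominates(t, s) for _, t in scored)]


def random_candidate(rng):
    """Rejection-sample a feasible candidate from the search space."""
    while True:
        c = Candidate(rng.choice(DEPTHS), rng.choice(WIDTHS), rng.choice(PARALLELISM))
        if feasible(c):
            return c


def mutate(c, rng):
    """Resample each field with probability 0.3; retry until the child is feasible."""
    while True:
        child = Candidate(
            rng.choice(DEPTHS) if rng.random() < 0.3 else c.depth,
            rng.choice(WIDTHS) if rng.random() < 0.3 else c.width,
            rng.choice(PARALLELISM) if rng.random() < 0.3 else c.parallelism,
        )
        if feasible(child):
            return child


def search(generations=20, pop_size=16, seed=0):
    """Elitist loop: the non-dominated set survives and seeds the next generation."""
    rng = random.Random(seed)
    population = {random_candidate(rng) for _ in range(pop_size)}
    for _ in range(generations):
        parents = pareto_front(population)
        children = {mutate(rng.choice(parents), rng) for _ in range(pop_size)}
        population = set(parents) | children
    return pareto_front(population)


if __name__ == "__main__":
    for cand in sorted(search(), key=lambda c: estimate(c)[1]):
        acc, lat = estimate(cand)
        print(cand, f"accuracy_proxy={acc:.3f}", f"latency_proxy={lat:.1f}")
```

The result is a set of trade-off points rather than a single winner. A real flow would replace `estimate` with measured or modeled accuracy and latency for the target device and would choose one point from the front according to the deployment constraints.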
