{"title":"边缘计算环境下变压器推理加速","authors":"Mingchu Li, Wenteng Zhang, Dexin Xia","doi":"10.1109/CCGridW59191.2023.00030","DOIUrl":null,"url":null,"abstract":"The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computing load of DNNs. Traditional cloud computing solutions suffer from distance and bandwidth constraints, and cannot meet latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), Transformer has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we have explored the acceleration of Transformer in edge environments. To model Vision Transformer, we have employed the design concept of a multi-branch network and proposed an optimization strategy to accelerate Transformer inference in the edge environment. 
We have evaluated our approach on three publicly available datasets and demonstrated its superior performance in challenging network conditions when compared to existing mainstream DNN collaborative acceleration inference techniques.","PeriodicalId":341115,"journal":{"name":"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformer Inference Acceleration in Edge Computing Environment\",\"authors\":\"Mingchu Li, Wenteng Zhang, Dexin Xia\",\"doi\":\"10.1109/CCGridW59191.2023.00030\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computing load of DNNs. Traditional cloud computing solutions suffer from distance and bandwidth constraints, and cannot meet latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), Transformer has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we have explored the acceleration of Transformer in edge environments. To model Vision Transformer, we have employed the design concept of a multi-branch network and proposed an optimization strategy to accelerate Transformer inference in the edge environment. 
We have evaluated our approach on three publicly available datasets and demonstrated its superior performance in challenging network conditions when compared to existing mainstream DNN collaborative acceleration inference techniques.\",\"PeriodicalId\":341115,\"journal\":{\"name\":\"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCGridW59191.2023.00030\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGridW59191.2023.00030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Transformer Inference Acceleration in Edge Computing Environment
The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computational load of DNNs, and traditional cloud computing solutions suffer from distance and bandwidth constraints that prevent them from meeting latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs); the Transformer, however, has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we explore the acceleration of Transformer inference in edge environments. To model the Vision Transformer, we employ the design concept of a multi-branch network and propose an optimization strategy that accelerates Transformer inference in the edge environment. We evaluate our approach on three publicly available datasets and demonstrate its superior performance under challenging network conditions compared to existing mainstream DNN collaborative inference acceleration techniques.
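The device-edge collaborative inference the abstract describes typically hinges on choosing a partition point: early Transformer blocks run on the IoT device, the intermediate activation is shipped over the network, and the remaining blocks run on the edge server. The sketch below is a minimal, hypothetical latency model of that decision, not the paper's actual optimization strategy; all cost numbers and function names are illustrative assumptions.

```python
def best_split(device_ms, edge_ms, out_kb, bandwidth_kb_s, input_kb):
    """Pick the split point minimizing end-to-end latency.

    Blocks [0, split) run on the IoT device, blocks [split, n) on the
    edge server. All values below are illustrative assumptions:
      device_ms[i]  -- per-block latency on the device (ms)
      edge_ms[i]    -- per-block latency on the edge server (ms)
      out_kb[i]     -- size of block i's output activation (KB)
      bandwidth_kb_s-- device-to-edge uplink bandwidth (KB/s)
      input_kb      -- size of the raw model input (KB)
    Returns (split_index, total_latency_ms).
    """
    n = len(device_ms)
    best = None
    for split in range(n + 1):
        on_device = sum(device_ms[:split])
        on_edge = sum(edge_ms[split:])
        # Data shipped to the edge: the raw input if nothing ran
        # locally, otherwise the last local block's activation.
        sent_kb = input_kb if split == 0 else out_kb[split - 1]
        # split == n means fully local inference: nothing is sent.
        transfer = 0.0 if split == n else sent_kb / bandwidth_kb_s * 1000
        total = on_device + transfer + on_edge
        if best is None or total < best[1]:
            best = (split, total)
    return best


# Toy example: a 2-block model. With a slow device and a fast link,
# offloading after block 1 (when its activation is small) can win.
split, latency = best_split(
    device_ms=[1, 100], edge_ms=[1, 1],
    out_kb=[1, 1], bandwidth_kb_s=1000, input_kb=100)
```

Under these toy numbers the search picks `split == 1`: running only the first block locally shrinks the data to transfer while avoiding the device's slow second block. A multi-branch model, as in the paper, would extend this by evaluating such trade-offs per branch rather than over a single chain of blocks.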