Transformer Inference Acceleration in Edge Computing Environment

Mingchu Li, Wenteng Zhang, Dexin Xia
{"title":"边缘计算环境下变压器推理加速","authors":"Mingchu Li, Wenteng Zhang, Dexin Xia","doi":"10.1109/CCGridW59191.2023.00030","DOIUrl":null,"url":null,"abstract":"The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computing load of DNNs. Traditional cloud computing solutions suffer from distance and bandwidth constraints, and cannot meet latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), Transformer has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we have explored the acceleration of Transformer in edge environments. To model Vision Transformer, we have employed the design concept of a multi-branch network and proposed an optimization strategy to accelerate Transformer inference in the edge environment. We have evaluated our approach on three publicly available datasets and demonstrated its superior performance in challenging network conditions when compared to existing mainstream DNN collaborative acceleration inference techniques.","PeriodicalId":341115,"journal":{"name":"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformer Inference Acceleration in Edge Computing Environment\",\"authors\":\"Mingchu Li, Wenteng Zhang, Dexin Xia\",\"doi\":\"10.1109/CCGridW59191.2023.00030\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computing load of DNNs. Traditional cloud computing solutions suffer from distance and bandwidth constraints, and cannot meet latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), Transformer has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we have explored the acceleration of Transformer in edge environments. To model Vision Transformer, we have employed the design concept of a multi-branch network and proposed an optimization strategy to accelerate Transformer inference in the edge environment. 
We have evaluated our approach on three publicly available datasets and demonstrated its superior performance in challenging network conditions when compared to existing mainstream DNN collaborative acceleration inference techniques.\",\"PeriodicalId\":341115,\"journal\":{\"name\":\"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCGridW59191.2023.00030\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGridW59191.2023.00030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computational load of DNNs, and traditional cloud computing solutions suffer from distance and bandwidth constraints that prevent them from meeting latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), whereas the Transformer has recently gained significant popularity owing to its outstanding performance in natural language processing, image processing, and other domains. In this context, we explore the acceleration of the Transformer in edge environments. To model the Vision Transformer, we employ the design concept of a multi-branch network and propose an optimization strategy that accelerates Transformer inference in the edge environment. We evaluate our approach on three publicly available datasets and demonstrate its superior performance under challenging network conditions compared to existing mainstream collaborative DNN inference acceleration techniques.
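
The abstract does not detail the multi-branch model or the optimization strategy, so the following is only a minimal illustrative sketch of the general idea of collaborative Transformer inference between an IoT device and an edge server: a toy ViT-style encoder is split at a hypothetical split point, the device executes the first few blocks, and the remaining blocks run on the server. All names here (TinyViTEncoder, collaborative_inference, send_fn, split_point) are assumptions introduced for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the paper's actual multi-branch model and
# partitioning strategy are not specified in the abstract.
import torch
import torch.nn as nn


class TinyViTEncoder(nn.Module):
    """Toy ViT-style encoder (patch embedding + a stack of Transformer
    blocks), used here only to illustrate layer-wise partitioning."""

    def __init__(self, patch=16, dim=192, depth=12, heads=3):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(
                d_model=dim, nhead=heads,
                dim_feedforward=4 * dim, batch_first=True)
            for _ in range(depth)
        ])

    def forward_range(self, x, start, end):
        """Run only blocks [start, end), so the device and the server can
        each execute a contiguous slice of the model."""
        if start == 0:
            x = self.patch_embed(x)           # (B, dim, H/16, W/16)
            x = x.flatten(2).transpose(1, 2)  # (B, num_patches, dim)
        for blk in self.blocks[start:end]:
            x = blk(x)
        return x


def collaborative_inference(model, image, split_point, send_fn):
    """Hypothetical device/server split: the device runs the first
    `split_point` blocks, `send_fn` stands in for transmitting the
    intermediate tensor to the edge server, which runs the rest."""
    device_out = model.forward_range(image, 0, split_point)
    server_in = send_fn(device_out)  # e.g. serialize + send over the network
    return model.forward_range(server_in, split_point, len(model.blocks))


if __name__ == "__main__":
    model = TinyViTEncoder().eval()
    img = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        out = collaborative_inference(model, img, split_point=4,
                                      send_fn=lambda t: t)  # identity "network"
    print(out.shape)  # torch.Size([1, 196, 192])
```

In a real deployment, send_fn would serialize and transmit the intermediate activations, and the split point would be chosen from measured bandwidth and device compute; selecting such a partition under varying network conditions is the kind of trade-off the paper's optimization strategy presumably targets.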