{"title":"边缘计算环境下变压器推理加速","authors":"Mingchu Li, Wenteng Zhang, Dexin Xia","doi":"10.1109/CCGridW59191.2023.00030","DOIUrl":null,"url":null,"abstract":"The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computing load of DNNs. Traditional cloud computing solutions suffer from distance and bandwidth constraints, and cannot meet latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), Transformer has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we have explored the acceleration of Transformer in edge environments. To model Vision Transformer, we have employed the design concept of a multi-branch network and proposed an optimization strategy to accelerate Transformer inference in the edge environment. 
We have evaluated our approach on three publicly available datasets and demonstrated its superior performance in challenging network conditions when compared to existing mainstream DNN collaborative acceleration inference techniques.","PeriodicalId":341115,"journal":{"name":"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformer Inference Acceleration in Edge Computing Environment\",\"authors\":\"Mingchu Li, Wenteng Zhang, Dexin Xia\",\"doi\":\"10.1109/CCGridW59191.2023.00030\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computing load of DNNs. Traditional cloud computing solutions suffer from distance and bandwidth constraints, and cannot meet latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs), Transformer has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we have explored the acceleration of Transformer in edge environments. To model Vision Transformer, we have employed the design concept of a multi-branch network and proposed an optimization strategy to accelerate Transformer inference in the edge environment. 
We have evaluated our approach on three publicly available datasets and demonstrated its superior performance in challenging network conditions when compared to existing mainstream DNN collaborative acceleration inference techniques.\",\"PeriodicalId\":341115,\"journal\":{\"name\":\"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCGridW59191.2023.00030\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGridW59191.2023.00030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Transformer Inference Acceleration in Edge Computing Environment
The rapid development of deep neural networks (DNNs) has provided a strong foundation for the popularization of intelligent applications. However, the limited computing power of IoT devices cannot support the massive computational load of DNNs, and traditional cloud computing solutions suffer from distance and bandwidth constraints that prevent them from meeting latency-sensitive requirements. Prior research on DNN acceleration has predominantly focused on convolutional neural networks (CNNs); the Transformer, however, has recently gained significant popularity due to its outstanding performance in natural language processing, image processing, and other domains. In this context, we explore the acceleration of Transformer inference in edge environments. To model the Vision Transformer, we employ the design concept of a multi-branch network and propose an optimization strategy that accelerates Transformer inference in the edge environment. We evaluate our approach on three publicly available datasets and demonstrate its superior performance under challenging network conditions compared to existing mainstream DNN collaborative inference acceleration techniques.
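The device-edge collaborative inference the abstract describes typically hinges on choosing a partition point: early Transformer blocks run on the IoT device, the intermediate activation is shipped over the network, and the remaining blocks run on the edge server. The sketch below is a minimal, hypothetical latency model of that decision, not the paper's actual optimization strategy; all cost numbers and function names are illustrative assumptions.

```python
def best_split(device_ms, edge_ms, out_kb, bandwidth_kb_s, input_kb):
    """Pick the split point minimizing end-to-end latency.

    Blocks [0, split) run on the IoT device, blocks [split, n) on the
    edge server. All values below are illustrative assumptions:
      device_ms[i]  -- per-block latency on the device (ms)
      edge_ms[i]    -- per-block latency on the edge server (ms)
      out_kb[i]     -- size of block i's output activation (KB)
      bandwidth_kb_s-- device-to-edge uplink bandwidth (KB/s)
      input_kb      -- size of the raw model input (KB)
    Returns (split_index, total_latency_ms).
    """
    n = len(device_ms)
    best = None
    for split in range(n + 1):
        on_device = sum(device_ms[:split])
        on_edge = sum(edge_ms[split:])
        # Data shipped to the edge: the raw input if nothing ran
        # locally, otherwise the last local block's activation.
        sent_kb = input_kb if split == 0 else out_kb[split - 1]
        # split == n means fully local inference: nothing is sent.
        transfer = 0.0 if split == n else sent_kb / bandwidth_kb_s * 1000
        total = on_device + transfer + on_edge
        if best is None or total < best[1]:
            best = (split, total)
    return best


# Toy example: a 2-block model. With a slow device and a fast link,
# offloading after block 1 (when its activation is small) can win.
split, latency = best_split(
    device_ms=[1, 100], edge_ms=[1, 1],
    out_kb=[1, 1], bandwidth_kb_s=1000, input_kb=100)
```

Under these toy numbers the search picks `split == 1`: running only the first block locally shrinks the data to transfer while avoiding the device's slow second block. A multi-branch model, as in the paper, would extend this by evaluating such trade-offs per branch rather than over a single chain of blocks.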