Breaking the Edge: Enabling Efficient Neural Network Inference on Integrated Edge Devices

Impact Factor 5.3 · JCR Q1 (Computer Science, Information Systems) · CAS Tier 2 (Computer Science)
Feng Zhang;Chenyang Zhang;Jiawei Guan;Qiangjun Zhou;Kuangyu Chen;Xiao Zhang;Bingsheng He;Jidong Zhai;Xiaoyong Du
DOI: 10.1109/TCC.2025.3559346
Journal: IEEE Transactions on Cloud Computing, vol. 13, no. 2, pp. 694-710
Published: 2025-04-09
Article page: https://ieeexplore.ieee.org/document/10959707/
Citations: 0

Abstract

Edge computing has gained widespread attention in cloud computing due to the increasing demands of AIoT applications and the evolution of edge architectures. One prevalent application in this domain is neural network inference at the edge. This article presents an in-depth exploration of inference on integrated edge devices and introduces EdgeNN, a groundbreaking inference solution specifically designed for CPU-GPU integrated edge devices. EdgeNN offers three key innovations. First, EdgeNN adaptively employs zero-copy optimization by harnessing unified physical memory. Second, EdgeNN introduces an innovative approach to CPU-GPU hybrid execution tailored for inference tasks. This technique enables concurrent CPU and GPU operation, effectively leveraging edge platforms' computational capabilities. Third, EdgeNN adopts a finely tuned adaptive inference tuning technique that analyzes complex inference structures. It divides computations into sub-tasks, intelligently assigning them to the two processors for better performance. Experimental results demonstrate EdgeNN's superiority across six popular neural network inference workloads. EdgeNN delivers average speedups of 3.97×, 4.10×, 3.12×, and 8.80× over inference on four distinct edge CPUs. Furthermore, EdgeNN achieves significant time advantages compared to the direct execution of the original programs. This improvement is attributed to better unified memory utilization (44.37%) and the innovative CPU-GPU hybrid execution approach (17.91%). Additionally, EdgeNN exhibits superior energy efficiency, providing 29.14× higher energy efficiency than edge CPUs and 5.70× higher energy efficiency than discrete GPUs. EdgeNN is now open source at https://github.com/ChenyangZhang-cs/EdgeNN.
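To make the third idea concrete, the sketch below illustrates one simple way sub-tasks could be split between a CPU and a GPU based on per-task cost estimates: a greedy heuristic that considers expensive sub-tasks first and assigns each to whichever processor would finish earlier. All names, costs, and the heuristic itself are hypothetical illustrations of the general idea, not EdgeNN's actual adaptive tuning technique, which the paper describes in far more detail.

```python
def partition(tasks):
    """Greedily split sub-tasks between CPU and GPU.

    tasks: list of (name, cpu_cost, gpu_cost) tuples with estimated
    execution times. Returns (cpu_tasks, gpu_tasks, makespan).
    """
    cpu_tasks, gpu_tasks = [], []
    cpu_time = gpu_time = 0.0
    # Consider the most expensive sub-tasks first (LPT-style greedy),
    # then place each where it would complete soonest.
    for name, cpu_cost, gpu_cost in sorted(
            tasks, key=lambda t: min(t[1], t[2]), reverse=True):
        if cpu_time + cpu_cost <= gpu_time + gpu_cost:
            cpu_tasks.append(name)
            cpu_time += cpu_cost
        else:
            gpu_tasks.append(name)
            gpu_time += gpu_cost
    # Both processors run concurrently, so the slower one bounds latency.
    return cpu_tasks, gpu_tasks, max(cpu_time, gpu_time)

# Hypothetical per-layer cost estimates (ms) on an integrated device.
layers = [("conv1", 8.0, 2.0), ("conv2", 9.0, 2.5),
          ("pool", 1.0, 1.5), ("fc", 3.0, 4.0)]
cpu, gpu, makespan = partition(layers)
```

With these sample costs, the convolutions land on the GPU while the small fully-connected and pooling layers run on the CPU, so both processors stay busy instead of the CPU idling while the GPU works.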
Source journal: IEEE Transactions on Cloud Computing (Computer Science, Software)
CiteScore: 9.40
Self-citation rate: 6.20%
Articles per year: 167
Journal description: The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.