ETBench: Characterizing Hybrid Vision Transformer Workloads Across Edge Devices

Impact factor 3.6 · Zone 2, Computer Science · JCR Q2, Computer Science, Hardware & Architecture

IEEE Transactions on Computers, vol. 74, no. 6, pp. 1857-1871 · Pub Date: 2025-02-19 · DOI: 10.1109/TC.2025.3543697

Yingkun Zhou;Zhengshuyuan Tian;Wenhao Yang;Tingting Zhang;Jinpeng Ye;Chenji Han;Tianyi Liu;Fuxin Zhang


Citation count: 0

Abstract



Lightweight hybrid models that combine convolution with Vision Transformers increasingly dominate the frontier of deep learning (DL) on edge devices. However, to the best of our knowledge, no prior work has comprehensively evaluated hybrid models' performance or analyzed their characteristics across the edge ecosystem, with its diverse modern DL inference engines and heterogeneous hardware. This paper proposes ETBench, a comprehensive open-source benchmark suite for assessing the power efficiency, performance, and accuracy of state-of-the-art (SOTA) hybrid models across 11 of the most widely used DL engines deployed on diverse edge devices. After building ETBench to satisfy the 6 design requirements proposed in our work, we conduct extensive experiments on 14 devices comprising 19 CPUs, 11 GPUs, and 5 NPUs, and obtain benchmark results for all deployment scenarios (combinations of models, quantization formats, software engines, and hardware platforms). We then summarize the resulting observations and their implications. For example, within current DL engines, INT8 quantization significantly underperforms FP16 in both accuracy and speed for hybrid models. Overall, ETBench serves as a collaborative platform that helps model architects better evaluate their models and enables future co-optimization of DL engines and hardware accelerators.
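The abstract describes a deployment scenario as one combination of model, quantization format, software engine, and hardware platform. As a rough illustration only, the Python sketch below enumerates such combinations as a filtered Cartesian product. The `Scenario` class, the `enumerate_scenarios` helper, every placeholder name, and the support rule are assumptions made for this example; they are not taken from ETBench or from the paper's actual model, engine, or device lists.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    """One deployment scenario: a point in the four-dimensional benchmark space."""
    model: str
    quant: str
    engine: str
    hardware: str


def enumerate_scenarios(models, quants, engines, hardware, supported=lambda s: True):
    """Return every combination of the four dimensions that passes `supported`."""
    candidates = (Scenario(*combo) for combo in product(models, quants, engines, hardware))
    return [s for s in candidates if supported(s)]


# Placeholder values, NOT the models, engines, or devices used in the paper.
models = ["hybrid_model_a", "hybrid_model_b"]
quants = ["FP16", "INT8"]
engines = ["engine_x", "engine_y"]
hardware = ["cpu", "gpu", "npu"]


def supported(s):
    # Hypothetical constraint: pretend engine_y cannot target the NPU.
    return not (s.engine == "engine_y" and s.hardware == "npu")


scenarios = enumerate_scenarios(models, quants, engines, hardware, supported)
print(len(scenarios))  # 2 * 2 * 2 * 3 = 24 combinations, minus 4 unsupported ones
```

Filtering with a predicate keeps the full combination space explicit while allowing pairs that cannot run together (for instance, an engine without a backend for a given accelerator) to be excluded before any measurement is attempted.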


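Source journal

IEEE Transactions on Computers (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 6.60
Self-citation rate: 5.40%
Articles published: 199
Review time: 6.0 months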

About the journal: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.