ETBench: Characterizing Hybrid Vision Transformer Workloads Across Edge Devices
Authors: Yingkun Zhou; Zhengshuyuan Tian; Wenhao Yang; Tingting Zhang; Jinpeng Ye; Chenji Han; Tianyi Liu; Fuxin Zhang
IEEE Transactions on Computers, vol. 74, no. 6, pp. 1857-1871. Published 2025-02-19. DOI: 10.1109/TC.2025.3543697
https://ieeexplore.ieee.org/document/10892343/
Lightweight hybrid models that combine convolution and Vision Transformer components increasingly dominate the frontier of deep learning (DL) on edge devices. However, to the best of our knowledge, no prior work has comprehensively evaluated the performance of hybrid models or analyzed their characteristics across the edge ecosystem, with its diverse modern DL inference engines and heterogeneous hardware. This paper proposes a comprehensive open-source benchmark suite, ETBench, that assesses the power efficiency, performance, and accuracy of state-of-the-art (SOTA) hybrid models across the 11 most widely used DL engines deployed on diverse edge devices. After building ETBench to satisfy the 6 design requirements proposed in our work, we conduct extensive experiments on 14 devices, including 19 CPUs, 11 GPUs, and 5 NPUs, and obtain benchmark results for all deployment scenarios (combinations of models, quantization formats, software engines, and hardware platforms). We then summarize our observations and their implications. For example, within current DL engines, INT8 quantization significantly underperforms FP16 in both accuracy and speed for hybrid models. Overall, ETBench serves as a collaborative platform that helps model architects better evaluate their models and enables future co-optimization of DL engines and hardware accelerators.
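As a rough illustration of what "deployment scenarios" means above, the following Python sketch enumerates every combination of model, quantization format, engine, and hardware platform. It is not ETBench's code: the `Scenario` type, the `enumerate_scenarios` helper, and all the list values are hypothetical placeholders, and a real harness would presumably also filter out combinations that an engine or device does not support.

```python
from itertools import product
from typing import NamedTuple


class Scenario(NamedTuple):
    """One deployment scenario (hypothetical structure, not from the paper)."""
    model: str
    quant: str
    engine: str
    hardware: str


def enumerate_scenarios(models, quants, engines, hardware):
    """Return every (model, quantization, engine, hardware) combination."""
    return [Scenario(*combo) for combo in product(models, quants, engines, hardware)]


# Placeholder values for illustration only; these are not the paper's lists.
scenarios = enumerate_scenarios(
    models=["hybrid_model_a", "hybrid_model_b"],
    quants=["FP16", "INT8"],
    engines=["engine_x", "engine_y", "engine_z"],
    hardware=["cpu", "gpu", "npu"],
)
print(len(scenarios))  # 2 * 2 * 3 * 3 = 36
```

The number of scenarios grows as the product of the four list lengths, which is why a benchmark covering many engines and devices yields a large result matrix.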
About the journal:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.