Characterizing Resource Heterogeneity in Edge Devices for Deep Learning Inferences

Jianwei Hao, Piyush Subedi, I. Kim, Lakshmish Ramaswamy

Proceedings of the 2021 Workshop on Systems and Network Telemetry and Analytics (SNTA '21), June 21, 2021
DOI: 10.1145/3452411.3464446
Significant advances in hardware capabilities and the availability of enormous data sets have led to the rise and penetration of artificial intelligence (AI) and deep learning (DL) in various domains. Considerable effort has been put forth in academia and industry to make these computationally demanding DL tasks work on resource-constrained edge devices. However, performing DL tasks on edge devices remains challenging due to the diversity of deep neural network (DNN) architectures and the heterogeneity of edge devices. This study evaluates and characterizes the performance and resource heterogeneity of various edge devices when performing DL tasks. We benchmark various DNN models for image classification on a set of edge devices ranging from the widely popular and relatively less powerful Raspberry Pi to GPU-equipped high-performance edge devices like the Jetson Xavier NX. We also compare and contrast the performance of three widely used DL frameworks on these edge devices. We report DL inference throughput, CPU and memory usage, power consumption, and framework initialization overhead, which are the most critical factors for characterizing DL tasks on edge devices. Additionally, we share our insights and findings, which give a better idea of how compatible or feasible edge devices are for running DL applications.
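The paper does not include code, but the metrics it reports (inference throughput, memory usage, and framework initialization overhead) can be collected with a simple measurement loop. The sketch below is a minimal, hypothetical harness, not the authors' benchmark: `fake_infer` is a stand-in for a real model call (e.g. invoking a TFLite interpreter or a PyTorch module on a Raspberry Pi or Jetson), and all names and parameters are illustrative assumptions.

```python
import time
import tracemalloc

def benchmark(infer, batch, n_iters=50, warmup=5):
    """Measure init overhead, steady-state throughput, and peak memory
    for a single inference callable. All knobs here are illustrative."""
    tracemalloc.start()

    # Time the very first call separately: DL frameworks often allocate
    # buffers, load weights, or JIT-compile on first use, so this call
    # approximates the framework/model initialization overhead.
    t0 = time.perf_counter()
    infer(batch)
    init_overhead = time.perf_counter() - t0

    # Warm-up iterations so caches and allocators reach steady state.
    for _ in range(warmup):
        infer(batch)

    # Timed loop: throughput = total images processed / elapsed seconds.
    t0 = time.perf_counter()
    for _ in range(n_iters):
        infer(batch)
    elapsed = time.perf_counter() - t0

    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "throughput_ips": n_iters * len(batch) / elapsed,  # images/second
        "init_overhead_s": init_overhead,
        "peak_mem_bytes": peak,  # Python-level peak; real harnesses read /proc
    }

# Stand-in "model": sums pixel values per image. Replace with a real
# framework call, e.g. interpreter.invoke() for TFLite.
def fake_infer(batch):
    return [sum(img) for img in batch]

stats = benchmark(fake_infer, batch=[[0.5] * 1024 for _ in range(8)])
```

Power consumption, which the paper also reports, cannot be read from inside Python on most boards; it is typically sampled with an external meter or, on Jetson devices, from the onboard INA3221 sensors exposed under sysfs.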