JointCS: Joint Search for Deep Model Compression and Segmentation on Heterogeneous IoT Devices
Xinyu Li, Bin Guo, Sicong Liu, Chen Qiu, Yunji Liang, Zhiwen Yu
2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS), 2021-12-01
DOI: 10.1109/ICPADS53394.2021.00059 (https://doi.org/10.1109/ICPADS53394.2021.00059)
Deep neural networks (DNNs) play an important role in a variety of intelligent applications (e.g., image classification and target recognition), but their heavy computational burden makes them difficult to deploy on resource-constrained IoT devices. Two categories of methods adjust model computation to address this problem: model compression and model segmentation. However, model compression reduces resource consumption mainly at the cost of accuracy, while model segmentation reduces it at the cost of communication latency. In this paper, we propose Joint Search for Model Compression and Segmentation (JointCS), which makes the following contributions: 1) We integrate model compression and model segmentation in an automatic, progressive framework that simplifies a model to fit different IoT resource requirements; JointCS produces a series of slim models that perform better in both accuracy and latency. 2) We train a network-architecture-aware latency predictor to quickly estimate the latency of the slimmed model on heterogeneous IoT devices. 3) We introduce a search algorithm that selects the optimal state during the progressive joint search. Finally, we evaluate the proposed method on image classification with the CIFAR datasets. Compared with the state-of-the-art approach, the proposed method achieves an inference speedup of 12.2% to 30.9% at the same accuracy.
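To make the idea of searching compression and segmentation together more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy analytical latency model in place of the trained latency predictor, a hypothetical lookup table of accuracy per compression level, and a plain exhaustive search in place of the paper's progressive search algorithm. All names (`Layer`, `estimate_latency`, `joint_search`, `keep_ratio`) and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    flops: float      # compute cost of the layer (arbitrary units, assumed)
    out_bytes: float  # size of the layer's output activation (assumed)


def estimate_latency(layers, keep_ratio, split, device_speed,
                     server_speed, bandwidth, input_bytes):
    """Toy stand-in for a latency predictor (assumption, not the paper's model).

    layers[:split] run on the IoT device, layers[split:] run on the server.
    Compression is modeled as scaling compute and activations by keep_ratio.
    """
    device = sum(l.flops * keep_ratio for l in layers[:split]) / device_speed
    server = sum(l.flops * keep_ratio for l in layers[split:]) / server_speed
    if split == len(layers):
        transfer = 0.0  # everything runs on the device, nothing is sent
    elif split == 0:
        transfer = input_bytes / bandwidth  # raw input is sent uncompressed
    else:
        transfer = layers[split - 1].out_bytes * keep_ratio / bandwidth
    return device + transfer + server


def joint_search(layers, accuracy_by_ratio, min_accuracy, **hw):
    """Enumerate (keep_ratio, split) pairs and keep the fastest feasible one."""
    best = None
    for keep_ratio, accuracy in accuracy_by_ratio.items():
        if accuracy < min_accuracy:
            continue  # this compression level violates the accuracy constraint
        for split in range(len(layers) + 1):
            latency = estimate_latency(layers, keep_ratio, split, **hw)
            if best is None or latency < best[2]:
                best = (keep_ratio, split, latency)
    return best


if __name__ == "__main__":
    layers = [Layer(100, 50), Layer(200, 20), Layer(400, 10)]
    accuracy_by_ratio = {1.0: 0.93, 0.5: 0.91, 0.25: 0.85}  # hypothetical values
    print(joint_search(layers, accuracy_by_ratio, min_accuracy=0.90,
                       device_speed=10, server_speed=100,
                       bandwidth=10, input_bytes=100))
```

The sketch only illustrates why the two choices interact: a more compressed model shrinks both on-device compute and the activation sent over the network, so the best split point can change with the compression level. Searching them jointly can therefore find configurations that tuning either one alone would miss.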