Decision Early-Exit: An Efficient Approach to Hasten Offloading in BranchyNets

Mariana S. M. Barbosa, R. G. Pacheco, R. S. Couto, Dianne S. V. Medeiros, M. Campista

2022 IEEE Latin-American Conference on Communications (LATINCOM), 30 November 2022. DOI: 10.1109/LATINCOM56090.2022.10000430
Abstract

Many works study partitioning and early exits in Deep Neural Networks (DNNs) to improve inference time. Early exits allow samples to be inferred in advance, exploiting the fact that some features are already learned at a DNN's initial layers. However, using early exits can slightly decrease model performance. Partitioning places the shallowest part of the model at the edge while the deeper layers reside in the cloud. Deciding at each early exit whether a sample must be sent to the cloud is time-consuming and increases the total inference time. Hence, reducing this decision time while maintaining model performance is currently an open challenge. In this paper, we propose Decision Early Exit (DEEx), implemented at the first early exit, which aims to reduce the total inference time by skipping unnecessary evaluations at later early exits that may not improve the model's performance. To this end, DEEx compares a predefined decision threshold with the prediction confidence level of each sample and decides whether the sample must be offloaded. We assess DEEx through a comparative analysis of how different decision-threshold values influence the inference time. Our results show a cost-benefit trade-off between the inference time and the threshold. Using DEEx in a simulated BranchyNet, we reduce the inference time by around 20% while maintaining the same accuracy achieved when all samples are offloaded.
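The abstract describes, but does not implement, the core DEEx rule: at the first early exit, compare each sample's prediction confidence with a predefined decision threshold and decide whether the sample is offloaded. Below is a minimal sketch of that rule, assuming the maximum softmax probability as the confidence measure; the names `deex_decision` and `decision_threshold` are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def deex_decision(branch_logits, decision_threshold=0.8):
    """Hypothetical sketch of the DEEx rule at the first early exit:
    keep the sample at the edge if the branch's top-class confidence
    reaches the threshold; otherwise offload it to the cloud, skipping
    the evaluations at the remaining early exits."""
    confidence = softmax(branch_logits).max()
    return "exit-at-edge" if confidence >= decision_threshold else "offload"

# A confident branch prediction finishes at the edge; an uncertain one is offloaded.
print(deex_decision(np.array([4.0, 0.5, 0.1])))  # exit-at-edge (confidence ~0.95)
print(deex_decision(np.array([1.0, 0.9, 0.8])))  # offload (confidence ~0.37)
```

Under this reading, lowering the threshold lets more samples finish at the edge while a higher threshold offloads more of them, which matches the cost-benefit trade-off between inference time and threshold that the results quantify.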