{"title":"Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity","authors":"O. Machidon, Davor Sluga, V. Pejović","doi":"10.1145/3437984.3458833","DOIUrl":null,"url":null,"abstract":"Recent advances in deep learning allow on-demand reduction of model complexity, without a need for re-training, thus enabling a dynamic trade-off between the inference accuracy and the energy savings. Approximate mobile computing, on the other hand, adapts the computation approximation level as the context of usage, and consequently the computation needs or result accuracy needs, vary. In this work, we propose a synergy between the two directions and develop a context-aware method for dynamically adjusting the width of an on-device neural network based on the input and context-dependent classification confidence. We implement our method on a human activity recognition neural network and through measurements on a real-world embedded device demonstrate that such a network would save up to 37.8% energy and induce only 1% loss of accuracy, if used for continuous activity monitoring in the field of elderly care.","PeriodicalId":269840,"journal":{"name":"Proceedings of the 1st Workshop on Machine Learning and Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st Workshop on Machine Learning and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3437984.3458833","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Recent advances in deep learning allow on-demand reduction of model complexity, without a need for re-training, thus enabling a dynamic trade-off between the inference accuracy and the energy savings. Approximate mobile computing, on the other hand, adapts the computation approximation level as the context of usage, and consequently the computation needs or result accuracy needs, vary. In this work, we propose a synergy between the two directions and develop a context-aware method for dynamically adjusting the width of an on-device neural network based on the input and context-dependent classification confidence. We implement our method on a human activity recognition neural network and through measurements on a real-world embedded device demonstrate that such a network would save up to 37.8% energy and induce only 1% loss of accuracy, if used for continuous activity monitoring in the field of elderly care.