{"title":"Mayo: A Framework for Auto-generating Hardware Friendly Deep Neural Networks","authors":"Yiren Zhao, Xitong Gao, R. Mullins, Chengzhong Xu","doi":"10.1145/3212725.3212726","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) have proved to be a convenient and powerful tool for a wide range of problems. However, the extensive computational and memory resource requirements hinder the adoption of DNNs in resource-constrained scenarios. Existing compression methods have been shown to significantly reduce the computation and memory requirements of many popular DNNs. These methods, however, remain elusive to non-experts, as they demand extensive manual tuning of hyperparameters. The effects of combining various compression techniques lack exploration because of the large design space. To alleviate these challenges, this paper proposes an automated framework, Mayo, which is built on top of TensorFlow and can compress DNNs with minimal human intervention. First, we present overriders which are recursively-compositional and can be configured to effectively compress individual components (e.g. weights, biases, layer computations and gradients) in a DNN. Second, we introduce novel heuristics and a global search algorithm to efficiently optimize hyperparameters. We demonstrate that without any manual tuning, Mayo generates a sparse ResNet-18 that is 5.13x smaller than the baseline with no loss in test accuracy. By composing multiple overriders, our tool produces a sparse 6-bit CIFAR-10 classifier with only 0.16% top-1 accuracy loss and a 34x compression rate. Mayo and all compressed models are publicly available. To our knowledge, Mayo is the first framework that supports overlapping multiple compression techniques and automatically optimizes hyperparameters in them.","PeriodicalId":419019,"journal":{"name":"Proceedings of the 2nd International Workshop on Embedded and Mobile Deep Learning","volume":"2649 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Workshop on Embedded and Mobile Deep Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3212725.3212726","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
Deep Neural Networks (DNNs) have proved to be a convenient and powerful tool for a wide range of problems. However, their extensive computational and memory requirements hinder the adoption of DNNs in resource-constrained scenarios. Existing compression methods have been shown to significantly reduce the computation and memory requirements of many popular DNNs. These methods, however, remain difficult for non-experts to apply, as they demand extensive manual tuning of hyperparameters. The effects of combining different compression techniques also remain largely unexplored because of the large design space. To alleviate these challenges, this paper proposes an automated framework, Mayo, which is built on top of TensorFlow and can compress DNNs with minimal human intervention. First, we present overriders, which are recursively compositional and can be configured to effectively compress individual components (e.g. weights, biases, layer computations and gradients) in a DNN. Second, we introduce novel heuristics and a global search algorithm to efficiently optimize hyperparameters. We demonstrate that, without any manual tuning, Mayo generates a sparse ResNet-18 that is 5.13x smaller than the baseline with no loss in test accuracy. By composing multiple overriders, our tool produces a sparse 6-bit CIFAR-10 classifier with only 0.16% top-1 accuracy loss and a 34x compression rate. Mayo and all compressed models are publicly available. To our knowledge, Mayo is the first framework that supports overlapping multiple compression techniques and automatically optimizes their hyperparameters.
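The abstract does not show Mayo's actual overrider API, so the following is a minimal sketch of the recursively compositional overrider idea in plain Python. The class names (Overrider, Pruner, Quantizer, Compose), the magnitude-pruning threshold, and the fixed-point rounding rule are hypothetical illustrations under assumed semantics, not Mayo's implementation.

```python
# Sketch only: a hypothetical overrider hierarchy illustrating how pruning
# and quantization could be composed, in the spirit of the paper's overriders.
import numpy as np


class Overrider:
    """Transforms a tensor (e.g. a layer's weights) before it is used."""

    def apply(self, tensor: np.ndarray) -> np.ndarray:
        raise NotImplementedError


class Pruner(Overrider):
    """Zeroes out entries whose magnitude falls below a threshold."""

    def __init__(self, threshold: float):
        self.threshold = threshold

    def apply(self, tensor: np.ndarray) -> np.ndarray:
        mask = np.abs(tensor) >= self.threshold
        return tensor * mask


class Quantizer(Overrider):
    """Rounds values onto a signed fixed-point grid with `bits` bits."""

    def __init__(self, bits: int):
        self.levels = 2 ** (bits - 1)  # magnitude levels for one sign

    def apply(self, tensor: np.ndarray) -> np.ndarray:
        scale = np.max(np.abs(tensor)) / self.levels
        if scale == 0:
            return tensor
        # Clip to the representable signed range, then snap to the grid.
        q = np.clip(np.round(tensor / scale), -self.levels, self.levels - 1)
        return q * scale


class Compose(Overrider):
    """Chains overriders; itself an Overrider, so composition is recursive."""

    def __init__(self, *overriders: Overrider):
        self.overriders = overriders

    def apply(self, tensor: np.ndarray) -> np.ndarray:
        for overrider in self.overriders:
            tensor = overrider.apply(tensor)
        return tensor


# Usage: a sparse, 6-bit view of a weight tensor (prune, then quantize).
weights = np.random.randn(64, 64).astype(np.float32)
compressed = Compose(Pruner(threshold=0.5), Quantizer(bits=6)).apply(weights)
print("sparsity:", float(np.mean(compressed == 0)))
```

Composing Pruner with Quantizer here mirrors how the paper's sparse 6-bit classifier overlaps pruning and quantization; in Mayo itself, overriders would wrap TensorFlow components rather than NumPy arrays, and their hyperparameters (threshold, bit width) would be set by the framework's automated search rather than by hand.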