{"title":"An Automatic Scheme for Optimizing the Size of Deep Networks","authors":"Wenting Ma, Zhipeng Zhang, Qingqing Xu, Wai Chen","doi":"10.1145/3432291.3432293","DOIUrl":null,"url":null,"abstract":"Large-scale datasets and complex architectures promote the development of CNN models. Although with stronger representation power, larger CNNs are more resource-hungry, which makes it difficult to deploy on resource-constrained Internet of Things (IoT) devices. Another serious challenge occurring with larger CNNs is their susceptibility to overfit with a small training dataset. In this paper, we ask the question: can we find an optimized compact model for a particular data set? We propose a novel scheme to optimize the model size so as to obtain a compact model instead of a larger model. The optimized model achieves higher accuracy than the widely used deeper model. In addition, it decreases the run-time memory and reduces the number of computing operations. This is achieved by applying Minimum Description Length (MDL) to find the optimal size of the model for a particular data set mathematically. MDL--the information-theoretic model selection principle assumes that the simplest, most compact representation model is the best model and most probable explanation of the data. We call our approach OptSize, model size is automatically identified, yielding compact models with comparable accuracy. We empirically demonstrate the effectiveness of our approach with several state-of-the-art CNN models, including VGGNet, ResNet on various image classification datasets. The result shows that compact nets obtained by our proposed method perform better to complex, well-engineered, deeper convolutional architectures.","PeriodicalId":126684,"journal":{"name":"Proceedings of the 2020 3rd International Conference on Signal Processing and Machine Learning","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 3rd International Conference on Signal Processing and Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3432291.3432293","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Large-scale datasets and complex architectures have driven the development of CNN models. Although larger CNNs have stronger representational power, they are also more resource-hungry, which makes them difficult to deploy on resource-constrained Internet of Things (IoT) devices. Another serious challenge with larger CNNs is their susceptibility to overfitting when the training dataset is small. In this paper, we ask: can we find an optimized compact model for a particular dataset? We propose a novel scheme that optimizes the model size so as to obtain a compact model instead of a larger one. The optimized model achieves higher accuracy than widely used deeper models; in addition, it decreases run-time memory and reduces the number of computing operations. This is achieved by applying the Minimum Description Length (MDL) principle to mathematically find the optimal model size for a particular dataset. MDL, an information-theoretic model selection principle, assumes that the simplest, most compact model is the best model and the most probable explanation of the data. We call our approach OptSize; it automatically identifies the model size, yielding compact models with comparable accuracy. We empirically demonstrate the effectiveness of our approach with several state-of-the-art CNN models, including VGGNet and ResNet, on various image classification datasets. The results show that the compact networks obtained by our proposed method outperform complex, well-engineered, deeper convolutional architectures.
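For reference, the standard two-part MDL criterion that this kind of approach builds on selects the model that minimizes the combined codelength of the model itself and of the data encoded with the model's help. This is a general statement of the principle, not the paper's exact objective, which may add further terms or approximations:

\[
M^{*} \;=\; \arg\min_{M \in \mathcal{M}} \; \underbrace{L(M)}_{\text{codelength of the model}} \;+\; \underbrace{L(D \mid M)}_{\text{codelength of the data given the model}}
\]

Here \(\mathcal{M}\) is the candidate model family (e.g., networks of varying width or depth), \(L(M)\) penalizes larger models, and \(L(D \mid M)\) measures how poorly the model fits the data; the minimizer \(M^{*}\) balances compactness against fit.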