Title: Compression of fully-connected layer in neural network by Kronecker product
Author: Jia-Nan Wu
Venue: 2016 Eighth International Conference on Advanced Computational Intelligence (ICACI)
DOI: 10.1109/ICACI.2016.7449822
Citations: 20
Abstract
In this paper we propose and study a technique to reduce the number of parameters in the fully-connected layers of neural networks using the Kronecker product, at a mild cost in prediction quality. The technique replaces fully-connected layers with so-called Kronecker fully-connected layers, in which the weight matrices are approximated by linear combinations of Kronecker products of smaller matrices. Just as the Kronecker product generalizes the outer product from vectors to matrices, our method generalizes the low-rank approximation method for fully-connected layers. We also combine Kronecker products of different shapes to increase modelling capacity. Experiments on the SVHN, scene text recognition, and ImageNet datasets demonstrate that we can achieve a 10x reduction in the number of parameters with less than a 1% drop in accuracy, showing the effectiveness and efficiency of our method.
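The core idea can be sketched in a few lines of NumPy. This is an illustrative example of approximating a fully-connected weight matrix by a single Kronecker product of two smaller matrices, not the paper's implementation; the function name `kfc_forward` and all shapes are assumptions. It also uses the standard identity (A ⊗ B) vec(X) = vec(B X Aᵀ) to apply the layer without materializing the full weight matrix.

```python
import numpy as np

# Sketch of a Kronecker fully-connected layer: approximate the weight matrix
# W of shape (m*p, n*q) by a single Kronecker product A kron B, where
# A is (m, n) and B is (p, q). Shapes here are illustrative only.
m, n, p, q = 16, 16, 16, 16
A = 0.1 * np.random.randn(m, n)      # small factor matrices (the parameters)
B = 0.1 * np.random.randn(p, q)

def kfc_forward(x, A, B):
    """Compute (A kron B) @ x without building the full Kronecker product.

    Uses (A kron B) vec(X) = vec(B X A^T), where x = vec(X) in
    column-major order and X has shape (q, n).
    """
    n_, q_ = A.shape[1], B.shape[1]
    X = x.reshape(n_, q_).T          # column-major un-vec of x into (q, n)
    Y = B @ X @ A.T                  # (p,q) @ (q,n) @ (n,m) -> (p, m)
    return Y.T.reshape(-1)           # column-major vec of Y, length m*p

x = np.random.randn(n * q)
y_fast = kfc_forward(x, A, B)
y_full = np.kron(A, B) @ x           # reference: explicit Kronecker product
assert np.allclose(y_fast, y_full)

# Parameter savings for this toy configuration:
full_params = (m * p) * (n * q)      # dense layer: 65536 parameters
kfc_params = m * n + p * q           # factored layer: 512 parameters
```

The paper's method is more general than this sketch: it approximates W by a linear combination of several Kronecker products, possibly of different factor shapes, which recovers low-rank approximation as a special case while adding modelling capacity.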