Expanding and Refining Hybrid Compressors for Efficient Object Re-Identification
Yi Xie; Hanxiao Wu; Jianqing Zhu; Huanqiang Zeng; Jing Zhang
IEEE Transactions on Image Processing. Published 2024-06-12. DOI: 10.1109/TIP.2024.3410684
Citations: 0
Abstract
Recent object re-identification (Re-ID) methods gain high efficiency via lightweight student models trained by knowledge distillation (KD). However, the large architectural gap between lightweight students and heavy teachers makes it difficult for students to receive and understand the teachers' knowledge, costing accuracy. To this end, we propose a refiner-expander-refiner (RER) structure that enlarges a student's representational capacity and then prunes the student's complexity. The expander is a multi-branch convolutional layer that expands the student's representational capacity to understand a teacher's knowledge comprehensively; it requires no feature-dimensional adapter, avoiding knowledge distortion. The two refiners are $1\times 1$ convolutional layers that prune the input and output channels of the expander. In addition, to alleviate the competition between accuracy-related and pruning-related gradients, we design a common consensus gradient resetting (CCGR) method, which discards unimportant channels according to the intersection of each sample's unimportant-channel judgments. Finally, the trained RER can be simplified into a single slim convolutional layer via re-parameterization to speed up inference. Together, these form our expanding and refining hybrid compressing (ERHC) method. Extensive experiments show that ERHC attains superior inference speed and accuracy; e.g., on the VeRi-776 dataset with ResNet101 as the teacher, ERHC saves 75.33% of model parameters (MP) and 74.29% of floating-point operations (FLOPs) without sacrificing accuracy.
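The intersection idea behind CCGR can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: it assumes each sample scores channel importance somehow (e.g., by gradient magnitude), nominates its k least-important channels, and only channels nominated by every sample in the batch are discarded; the `reset` step that zeroes those channels' gradients is likewise an assumed simplification.

```python
import numpy as np

def ccgr_consensus(importance, k):
    """Common consensus of unimportant channels across a batch.

    importance: (N, C) array of per-sample channel-importance scores
                (the scoring rule itself is an assumption here).
    k: number of least-important channels each sample nominates.
    Returns a boolean mask of channels judged unimportant by EVERY sample.
    """
    n, c = importance.shape
    nominated = np.zeros((n, c), dtype=bool)
    # Each sample marks its k lowest-scoring channels as unimportant.
    bottom_k = np.argsort(importance, axis=1)[:, :k]
    for i in range(n):
        nominated[i, bottom_k[i]] = True
    # A channel is discarded only if all samples agree (set intersection).
    return nominated.all(axis=0)

# Two samples scoring four channels (made-up numbers for illustration).
scores = np.array([[0.10, 0.90, 0.05, 0.80],
                   [0.20, 0.01, 0.70, 0.90]])
mask = ccgr_consensus(scores, k=2)
# Sample 0 nominates channels {2, 0}; sample 1 nominates {1, 0};
# the intersection keeps only channel 0 as truly unimportant.

# Hypothetical per-channel weight gradient: reset it on consensus channels
# so pruning-related updates stop competing with accuracy-related ones.
grad = np.ones((4, 3, 3))
grad[mask] = 0.0
```

The intersection makes the judgment conservative: one sample finding a channel useful is enough to spare its gradients.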
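The re-parameterization step relies on the fact that, without non-linearities or biases between them, a $1\times 1$ refiner, a $k\times k$ expander, and a second $1\times 1$ refiner compose into one equivalent $k\times k$ convolution. The sketch below verifies this with a single 3×3 branch standing in for the multi-branch expander (merging multiple branches would first sum their kernels) and a naive bias-free numpy convolution; it is an illustration of the algebra, not the paper's implementation.

```python
import numpy as np

def conv2d(x, w, pad=0):
    """Naive 2-D convolution (cross-correlation), stride 1.
    x: (C_in, H, W), w: (C_out, C_in, k, k) -> (C_out, H_out, W_out)."""
    if pad:
        x = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    c_out, _, k, _ = w.shape
    _, h, wd = x.shape
    out = np.zeros((c_out, h - k + 1, wd - k + 1))
    for o in range(c_out):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[o, i, j] = np.sum(w[o] * x[:, i:i + k, j:j + k])
    return out

def merge_rer(w_in, w_exp, w_out):
    """Fold refiner (1x1) -> expander (kxk) -> refiner (1x1), all bias-free,
    into one equivalent kxk kernel."""
    # Absorb the input refiner into the expander kernel.
    k_mid = np.einsum('omuv,mi->oiuv', w_exp, w_in[:, :, 0, 0])
    # Absorb the output refiner.
    return np.einsum('qo,oiuv->qiuv', w_out[:, :, 0, 0], k_mid)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 6))          # input feature map
w_in = rng.standard_normal((8, 4, 1, 1))    # input refiner (1x1)
w_exp = rng.standard_normal((8, 8, 3, 3))   # expander (one 3x3 branch)
w_out = rng.standard_normal((5, 8, 1, 1))   # output refiner (1x1)

y_seq = conv2d(conv2d(conv2d(x, w_in), w_exp, pad=1), w_out)
y_merged = conv2d(x, merge_rer(w_in, w_exp, w_out), pad=1)
assert np.allclose(y_seq, y_merged)
```

At inference time only the merged kernel is kept, so the expanded training-time capacity costs nothing at deployment.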