Deeply-Supervised Knowledge Synergy

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2019-06-01 DOI:10.1109/CVPR.2019.00716

Dawei Sun, Anbang Yao, Aojun Zhou, Hao Zhao

{"title":"Deeply-Supervised Knowledge Synergy","authors":"Dawei Sun, Anbang Yao, Aojun Zhou, Hao Zhao","doi":"10.1109/CVPR.2019.00716","DOIUrl":null,"url":null,"abstract":"Convolutional Neural Networks (CNNs) have become deeper and more complicated compared with the pioneering AlexNet. However, current prevailing training scheme follows the previous way of adding supervision to the last layer of the network only and propagating error information up layer-by-layer. In this paper, we propose Deeply-supervised Knowledge Synergy (DKS), a new method aiming to train CNNs with improved generalization ability for image classification tasks without introducing extra computational cost during inference. Inspired by the deeply-supervised learning scheme, we first append auxiliary supervision branches on top of certain intermediate network layers. While properly using auxiliary supervision can improve model accuracy to some degree, we go one step further to explore the possibility of utilizing the probabilistic knowledge dynamically learnt by the classifiers connected to the backbone network as a new regularization to improve the training. A novel synergy loss, which considers pairwise knowledge matching among all supervision branches, is presented. Intriguingly, it enables dense pairwise knowledge matching operations in both top-down and bottom-up directions at each training iteration, resembling a dynamic synergy process for the same task. We evaluate DKS on image classification datasets using state-of-the-art CNN architectures, and show that the models trained with it are consistently better than the corresponding counterparts. For instance, on the ImageNet classification benchmark, our ResNet-152 model outperforms the baseline model with a 1.47% margin in Top-1 accuracy. Code is available at https://github.com/sundw2014/DKS.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"61 1","pages":"6990-6999"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2019.00716","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 46

Abstract

Convolutional Neural Networks (CNNs) have become deeper and more complicated compared with the pioneering AlexNet. However, current prevailing training scheme follows the previous way of adding supervision to the last layer of the network only and propagating error information up layer-by-layer. In this paper, we propose Deeply-supervised Knowledge Synergy (DKS), a new method aiming to train CNNs with improved generalization ability for image classification tasks without introducing extra computational cost during inference. Inspired by the deeply-supervised learning scheme, we first append auxiliary supervision branches on top of certain intermediate network layers. While properly using auxiliary supervision can improve model accuracy to some degree, we go one step further to explore the possibility of utilizing the probabilistic knowledge dynamically learnt by the classifiers connected to the backbone network as a new regularization to improve the training. A novel synergy loss, which considers pairwise knowledge matching among all supervision branches, is presented. Intriguingly, it enables dense pairwise knowledge matching operations in both top-down and bottom-up directions at each training iteration, resembling a dynamic synergy process for the same task. We evaluate DKS on image classification datasets using state-of-the-art CNN architectures, and show that the models trained with it are consistently better than the corresponding counterparts. For instance, on the ImageNet classification benchmark, our ResNet-152 model outperforms the baseline model with a 1.47% margin in Top-1 accuracy. Code is available at https://github.com/sundw2014/DKS.

查看原文本刊更多论文

深度监督知识协同

与先驱AlexNet相比，卷积神经网络(cnn)已经变得更加深入和复杂。然而，目前流行的训练方案沿用了之前的方法，即只在网络的最后一层增加监督，逐层传播错误信息。在本文中，我们提出了深度监督知识协同(deep -supervised Knowledge Synergy, DKS)，这是一种新的方法，旨在训练具有更高泛化能力的cnn来完成图像分类任务，而不会在推理过程中引入额外的计算成本。受深度监督学习方案的启发，我们首先在某些中间网络层上附加辅助监督分支。适当地使用辅助监督可以在一定程度上提高模型的准确性，我们进一步探索了利用连接到骨干网络的分类器动态学习的概率知识作为一种新的正则化来改进训练的可能性。提出了一种考虑各监管分支间知识配对匹配的新型协同损失算法。有趣的是，在每个训练迭代中，它支持自上而下和自下而上方向上的密集成对知识匹配操作，类似于同一任务的动态协同过程。我们使用最先进的CNN架构在图像分类数据集上评估DKS，并表明使用它训练的模型始终优于相应的模型。例如，在ImageNet分类基准上，我们的ResNet-152模型在Top-1准确率上比基线模型高出1.47%。代码可从https://github.com/sundw2014/DKS获得。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

自引率

0.00%

发文量