"In-Network Ensemble": Deep Ensemble Learning with Diversified Knowledge Distillation

Xingjian Li, Haoyi Xiong, Zeyu Chen, Jun Huan, Chengzhong Xu, D. Dou

ACM Transactions on Intelligent Systems and Technology (TIST), October 31, 2021. DOI: 10.1145/3473464
Ensemble learning is a widely used technique for training deep convolutional neural networks (CNNs) with improved robustness and accuracy. Whereas existing algorithms usually train multiple diversified networks first and then assemble them into an aggregated classifier, we propose a novel learning paradigm, "In-Network Ensemble" (INE), that incorporates the diversity of multiple models by training a SINGLE deep neural network. Specifically, INE segments the outputs of the CNN into multiple independent classifiers, each of which is further fine-tuned for better accuracy through a diversified knowledge distillation process. We then aggregate the fine-tuned independent classifiers using an Averaging-and-Softmax operator to obtain the final ensemble classifier. Note that, in the supervised learning setting, INE trains the CNN from random initialization, while, in the transfer learning setting, it can also start from a pre-trained model to incorporate knowledge learned from additional datasets. Extensive experiments were conducted on eight large-scale real-world datasets, including CIFAR, ImageNet, and Stanford Cars, among others, with common deep network architectures such as VGG, ResNet, and Wide ResNet. We evaluated the method under two tasks: supervised learning and transfer learning. The results show that INE outperforms state-of-the-art deep ensemble learning algorithms with improved accuracy.
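To make the output segmentation and the Averaging-and-Softmax aggregation concrete, the following is a minimal PyTorch sketch. It is an illustration under stated assumptions, not the authors' implementation: the backbone (ResNet-18), the number of heads, and the aggregation order (average the per-head logits, then apply a softmax) are hypothetical choices, and the diversified knowledge distillation fine-tuning step is omitted.

```python
# Minimal sketch of "In-Network Ensemble"-style output segmentation and an
# Averaging-and-Softmax aggregation, as described in the abstract.
# Assumptions (not from the paper): ResNet-18 backbone, 3 heads, and
# "average logits, then softmax" as the aggregation order.
import torch
import torch.nn as nn
import torchvision.models as models


class INEClassifier(nn.Module):
    """A single CNN whose final output is segmented into several
    independent classifiers over the same label set."""

    def __init__(self, num_classes: int, num_heads: int = 3):
        super().__init__()
        self.num_classes = num_classes
        self.num_heads = num_heads
        backbone = models.resnet18(weights=None)      # any common CNN backbone
        in_features = backbone.fc.in_features
        # One wide linear layer whose output is split into `num_heads`
        # segments of `num_classes` logits each.
        backbone.fc = nn.Linear(in_features, num_classes * num_heads)
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.backbone(x)                      # (B, heads * classes)
        # Reshape so each slice along dim=1 acts as one independent classifier.
        return logits.view(-1, self.num_heads, self.num_classes)

    def ensemble_predict(self, x: torch.Tensor) -> torch.Tensor:
        """Averaging-and-Softmax aggregation: average the per-head logits,
        then apply a softmax to obtain the ensemble distribution."""
        per_head_logits = self.forward(x)              # (B, heads, classes)
        avg_logits = per_head_logits.mean(dim=1)       # (B, classes)
        return torch.softmax(avg_logits, dim=-1)


if __name__ == "__main__":
    model = INEClassifier(num_classes=10, num_heads=3)
    images = torch.randn(4, 3, 224, 224)
    probs = model.ensemble_predict(images)
    print(probs.shape)                                 # torch.Size([4, 10])
```

In this sketch each head sees the same backbone features, so the diversity among heads would come from the (omitted) diversified knowledge distillation fine-tuning rather than from separately trained networks.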