Dynamic collaborative learning with heterogeneous knowledge transfer for long-tailed visual recognition

IF 14.7 | Zone 1, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

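Information Fusion, Volume 115, Article 102734 | Pub Date: 2024-10-15 | DOI: 10.1016/j.inffus.2024.102734
Full text: https://www.sciencedirect.com/science/article/pii/S1566253524005128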

Hao Zhou, Tingjin Luo, Yongming He


Citations: 0

Abstract


Long-tailed visual recognition with deep convolutional neural networks remains a challenging task. Multi-expert models, a mainstream approach, achieve state-of-the-art (SOTA) accuracy on this problem, but uncertainty in network learning and the complexity of fusion inference limit their performance and practicality. To remedy this, we propose a novel dynamic collaborative learning model with heterogeneous knowledge transfer (DCHKT), in which experts with different expertise collaborate to make predictions. DCHKT consists of two core components: dynamic adaptive weight adjustment and heterogeneous knowledge transfer learning. First, dynamic adaptive weight adjustment shifts the focus of training between the global expert and the domain experts via a dynamic adaptive weight. By modulating the trade-off between feature learning and classifier learning, it enhances the discriminative ability of each expert and alleviates the uncertainty of model learning. Second, heterogeneous knowledge transfer learning measures the distribution differences between the fused logits of the multiple experts and the predicted logits of each specialized expert. This enables message passing between experts and improves the consistency of the ensemble prediction between training and inference, which promotes collaboration among the experts. Finally, extensive experiments on public long-tailed datasets (CIFAR-LT, ImageNet-LT, Places-LT, and iNaturalist 2018) demonstrate the effectiveness and superiority of DCHKT.
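The abstract does not give the formulas behind the two components, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that the experts' logits are fused by a plain mean, that the "distribution difference" is a KL divergence between softmax outputs, and that the dynamic weight follows a parabolic decay over training epochs. All function names (`fuse_logits`, `transfer_loss`, `dynamic_weight`, `total_loss`) and the `kt_coeff` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def fuse_logits(expert_logits):
    """ASSUMPTION: fusion is the element-wise mean of the experts' logits."""
    n_experts = len(expert_logits)
    return [sum(column) / n_experts for column in zip(*expert_logits)]


def transfer_loss(expert_logits, temperature=1.0):
    """ASSUMPTION: knowledge transfer is the mean KL divergence between the
    fused prediction and each individual expert's prediction."""
    fused = softmax(fuse_logits(expert_logits), temperature)
    per_expert = [
        kl_divergence(fused, softmax(logits, temperature))
        for logits in expert_logits
    ]
    return sum(per_expert) / len(per_expert)


def dynamic_weight(epoch, total_epochs):
    """ASSUMPTION: parabolic decay from 1 (first epoch) to 0 (last epoch).
    The abstract does not state the schedule or its direction."""
    return 1.0 - (epoch / total_epochs) ** 2


def total_loss(global_loss, domain_loss, kt_loss, epoch, total_epochs, kt_coeff=1.0):
    """Blend the global-expert and domain-expert losses with the dynamic
    weight, then add the knowledge-transfer term."""
    alpha = dynamic_weight(epoch, total_epochs)
    return alpha * global_loss + (1.0 - alpha) * domain_loss + kt_coeff * kt_loss


if __name__ == "__main__":
    # Three hypothetical experts scoring one sample over four classes.
    logits = [
        [2.0, 0.5, 0.1, -1.0],
        [1.5, 1.0, 0.0, -0.5],
        [0.2, 0.3, 1.8, 0.0],
    ]
    kt = transfer_loss(logits)
    print("knowledge-transfer loss:", kt)
    print("total loss at epoch 5 of 10:", total_loss(1.2, 0.9, kt, 5, 10))
```

Under these assumptions the transfer term is zero when all experts already agree and grows as their predictions diverge, which matches the abstract's stated goal of keeping ensemble predictions consistent. The weight schedule simply moves the training emphasis from one loss to the other as epochs advance.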


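Source journal

Information Fusion (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months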

Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.