Multi-label image classification using adaptive graph convolutional networks: From a single domain to multiple domains

IF 4.3, Zone 3 (Computer Science), Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Computer Vision and Image Understanding, Pub Date: 2024-07-01, DOI: 10.1016/j.cviu.2024.104062

Inder Pal Singh, Enjie Ghorbel, Oyebade Oyedotun, Djamila Aouada


Citations: 0

Abstract

This paper proposes an adaptive graph-based approach for multi-label image classification. Graph-based methods have been widely used in multi-label classification because they can model label correlations. Their effectiveness has been demonstrated both within a single domain and across multiple domains. However, the topology of the graph they use is not optimal, as it is pre-defined heuristically. In addition, consecutive Graph Convolutional Network (GCN) aggregations tend to destroy feature similarity. To overcome these issues, an architecture that learns the graph connectivity end-to-end is introduced; it integrates an attention-based mechanism and a similarity-preserving strategy. The proposed framework is then extended to multiple domains using an adversarial training scheme. Numerous experiments are reported on well-known single-domain and multi-domain benchmarks. The results show that the approach is competitive with the state of the art in terms of mean Average Precision (mAP) and model size. The code will be made publicly available.
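To make the idea of a learned, attention-based graph topology more concrete, the following is a minimal sketch in plain Python. It is not the authors' architecture: the abstract gives no implementation details, so the scaled dot-product attention, the function names (`attention_adjacency`, `gcn_layer`), and the toy label embeddings are all assumptions made for illustration. The sketch only shows how an adjacency matrix can be derived from the node features themselves instead of being fixed heuristically in advance; the similarity-preserving strategy and the adversarial multi-domain extension are not covered.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def attention_adjacency(nodes):
    """Assumed attention form: row-normalised scaled dot-product
    similarity between label-node embeddings."""
    scale = math.sqrt(len(nodes[0]))
    scores = [[sum(p * q for p, q in zip(u, v)) / scale for v in nodes]
              for u in nodes]
    return [softmax(row) for row in scores]

def gcn_layer(nodes, weight):
    """One aggregation step, ReLU(A H W), where A is recomputed from H
    rather than pre-defined."""
    adj = attention_adjacency(nodes)
    out = matmul(matmul(adj, nodes), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical toy input: 3 label nodes with 2-dimensional embeddings.
label_nodes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix

adjacency = attention_adjacency(label_nodes)
updated = gcn_layer(label_nodes, identity)
```

In this toy setup the first label attends more strongly to the second (similar embedding) than to the third (orthogonal embedding). In a trainable model the embeddings and the weight matrix would be learned jointly with the classifier, which is what allows the connectivity to adapt end-to-end.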




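Source journal

Computer Vision and Image Understanding (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.80

Self-citation rate: 4.40%

Articles published: 112

Review time: 79 days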

About the journal: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis, from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research areas include: theory; early vision; data structures and representations; shape; range; motion; matching and recognition; architecture and languages; vision systems.