Multi-View Fusion Graph Attention Network for Multilabel Class Incremental Learning
Anhui Tan, Yu Wang, Wei-Zhi Wu, Weiping Ding, Jiye Liang
Information Fusion, Volume 123, Article 103309
DOI: 10.1016/j.inffus.2025.103309
Published: 2025-05-22
URL: https://www.sciencedirect.com/science/article/pii/S1566253525003823
Citation count: 0
Abstract
Multilabel Class-Incremental Learning (MLCIL) is a variant of class-incremental learning and multilabel learning in which models learn from images or data associated with multiple labels, and new sets of classes are introduced incrementally. However, most existing MLCIL methods rely heavily on limited single-view features, which makes it difficult to capture class-specific characteristics and the correlations between different labels. Furthermore, MLCIL faces both intra-class and inter-class imbalances, which arise from the varying frequencies of class occurrences during each incremental session. To address these issues, we propose a novel MLCIL model called the Multi-View Fusion Graph Attention Network (MVGAT). First, the MVGAT architecture includes a multi-view feature extraction module that fuses class node features from three different perspectives of images, effectively capturing both local and global class-specific information. Second, MVGAT introduces a multi-view attention fusion module that combines the multi-view class node features based on label correlations. Importantly, the attention fusion modules trained in previous learning sessions are preserved, helping to mitigate catastrophic forgetting by providing independent probability predictions for their respective learned classes. Additionally, MVGAT is equipped with a pseudo-label correction module that improves the accuracy of pseudo-labels by integrating predictions from the current session with those from the historical frozen attention fusion modules. Moreover, an asymmetric loss function is developed to balance intra-class and inter-class performance by dynamically adjusting negative focusing parameters based on class occurrence frequency. Finally, experimental results on benchmark datasets demonstrate that MVGAT outperforms existing state-of-the-art methods.
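The abstract does not give the exact form of the asymmetric loss, but the general idea — down-weighting easy negatives with a per-class focusing parameter that depends on class frequency — can be sketched along the lines of the well-known asymmetric loss for multilabel classification. In this sketch the frequency-based scaling rule (`gamma_neg_base * freq / freq.max()`) is a hypothetical choice for illustration, not the paper's formula:

```python
import numpy as np

def asymmetric_loss(probs, targets, class_freq,
                    gamma_pos=0.0, gamma_neg_base=4.0):
    """Illustrative asymmetric multilabel loss.

    probs:      (batch, classes) predicted probabilities in (0, 1)
    targets:    (batch, classes) binary ground-truth labels
    class_freq: (classes,) occurrence counts per class

    The negative focusing exponent grows with class frequency, so easy
    negatives of frequent classes are down-weighted more aggressively.
    This scaling rule is an assumption made for this sketch.
    """
    probs = np.clip(probs, 1e-7, 1.0 - 1e-7)
    class_freq = np.asarray(class_freq, dtype=float)
    # Per-class negative focus: frequent classes get a larger exponent.
    gamma_neg = gamma_neg_base * class_freq / class_freq.max()
    # Positive term: standard focal-style weighting with gamma_pos.
    pos_term = targets * (1.0 - probs) ** gamma_pos * np.log(probs)
    # Negative term: p^gamma_neg shrinks the gradient of easy negatives.
    neg_term = (1.0 - targets) * probs ** gamma_neg * np.log(1.0 - probs)
    return -(pos_term + neg_term).mean()
```

Raising `gamma_neg_base` suppresses the contribution of confidently rejected negatives, which is the mechanism the abstract invokes for balancing intra-class and inter-class performance.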
Journal description:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.