Prototype-based multi-domain self-distillation for unbiased scene graph generation
Yuan Gao, Yaochen Li, Yujie Zang, Jingze Liu, Yuehu Liu
Neurocomputing, Volume 658, Article 131625. Published 2025-09-30. DOI: 10.1016/j.neucom.2025.131625
URL: https://www.sciencedirect.com/science/article/pii/S0925231225022970
Citations: 0
Abstract
Scene Graph Generation (SGG) plays an important role in strengthening visual image understanding. Existing methods often struggle to effectively represent implicit relationship features, which limits their capacity to distinguish between predicates. Meanwhile, these approaches are susceptible to imbalanced instance distributions, which hinders the efficient training of fine-grained predicates. To address these problems, we propose a novel prototype-based multi-domain self-distillation training framework. Specifically, a Multi-Domain Fusion (MDF) module is introduced to improve predicate feature representation by integrating global contextual information with local spatial- and frequency-domain information. Then, a Prototype Generation Network (PGN) is designed to build the class prototypes, covering the design of predicates at different granularities and their loss functions. Furthermore, we design two data balancing strategies under the guidance of the class prototypes, which mine the in-distribution and out-of-distribution information of the original data, respectively. Experimental results demonstrate that the proposed method outperforms existing methods on the VG, GQA, and Open Images V6 datasets, making it well suited to unbiased scene graph generation.
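The abstract describes the Prototype Generation Network (PGN) only at a high level. As a rough illustration of the general idea of prototype-based predicate classification, the PyTorch sketch below builds one learnable prototype per predicate class and classifies relation features by cosine similarity to those prototypes. The names (PrototypeClassifier, prototype_margin_loss) and the margin-based auxiliary loss are illustrative assumptions, not the authors' actual design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeClassifier(nn.Module):
    """One learnable prototype per predicate class; cosine-similarity logits."""
    def __init__(self, feat_dim: int, num_predicates: int, temperature: float = 0.1):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_predicates, feat_dim))
        self.temperature = temperature

    def forward(self, relation_feats: torch.Tensor) -> torch.Tensor:
        feats = F.normalize(relation_feats, dim=-1)
        protos = F.normalize(self.prototypes, dim=-1)
        # Temperature-scaled cosine similarity serves as the classification logits.
        return feats @ protos.t() / self.temperature

def prototype_margin_loss(logits, labels, margin=1.0):
    # Pull each sample toward its own class prototype and away from the
    # nearest other prototype (a simple margin-based surrogate loss).
    pos = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    mask = F.one_hot(labels, logits.size(1)).bool()
    neg = logits.masked_fill(mask, float("-inf")).max(dim=1).values
    return F.relu(neg - pos + margin).mean()

# Usage: 8 relation features, 50 predicate classes (as in the common VG split).
clf = PrototypeClassifier(feat_dim=256, num_predicates=50)
feats, labels = torch.randn(8, 256), torch.randint(0, 50, (8,))
logits = clf(feats)
loss = F.cross_entropy(logits, labels) + prototype_margin_loss(logits, labels)
loss.backward()

Class prototypes of this kind can also guide data re-balancing, for example by resampling instances whose features lie unusually near or far from their class prototype; this loosely corresponds to the in-distribution and out-of-distribution mining the abstract mentions, though the paper's concrete strategies are not specified here.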
Journal introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.