Hypergraph Mamba Reasoning-Based Social Relation Recognition
Wang Tang; Linbo Qing; Pingyu Wang; Lindong Li; Ce Zhu
IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, vol. 34, pp. 4814-4829, published 2025-07-30 (impact factor 13.7)
DOI: 10.1109/TIP.2025.3592551
https://ieeexplore.ieee.org/document/11104994/
Citations: 0
Abstract
Recognizing social relations from images is crucial for improving machine perception of social interactions. Current studies mainly focus on single-type relation reasoning frameworks, such as the relations among father, mother, and son in a family. However, real-world scenarios often involve complex hybrid relations, such as friendships and professional relations, which challenge current methods because it is difficult to establish robust logical connections between these relations. In this hybrid social relation recognition setting, the interactions extend beyond dyadic to multipartite structures. To effectively explore these multipartite interactions, we propose a novel Hypergraph Mamba (HGM) framework. Specifically, we construct two hypergraphs, i.e., Person-Person Hypergraphs (PPH) and Person-Object Hypergraphs (POH), to model these high-order multipartite interactions. The HGM module performs social relation reasoning within these hypergraph structures. It includes a Vertex Selection Algorithm, which mitigates inference confusion by filtering out confounders, and a Vertex Interaction Operator, which finds optimal global vertex neighborhoods by capturing long-range vertex dependencies. In addition, a Multilevel Transformer is proposed to adaptively align the knowledge inferred from the PPH and POH with visual signals to facilitate information fusion. We validate the effectiveness of the proposed HGM model on several public datasets and perform extensive ablation studies to explain the factors behind its performance. Experimental results indicate that the HGM model predicts social relations more accurately than state-of-the-art methods. Code and datasets are available at: https://github.com/tw-repository/HGM-SRR
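
To make the idea of multipartite (beyond pairwise) interactions concrete, the sketch below builds a small generic hypergraph and runs one round of mean aggregation over it. It is an illustration only and not the HGM implementation: the vertex names, the hyperedge groupings, the scalar features, and the functions `incidence_matrix` and `propagate` are all assumptions made for this example. It does not model the Vertex Selection Algorithm, the Vertex Interaction Operator, or the Multilevel Transformer.

```python
# Illustrative only: a generic hypergraph, not the authors' HGM code.
# All names, groupings, and feature values below are hypothetical.

def incidence_matrix(vertices, hyperedges):
    """Return H where H[i][j] = 1 if vertex i belongs to hyperedge j."""
    index = {v: i for i, v in enumerate(vertices)}
    H = [[0] * len(hyperedges) for _ in vertices]
    for j, edge in enumerate(hyperedges):
        for v in edge:
            H[index[v]][j] = 1
    return H

def propagate(H, features):
    """One round of mean aggregation: vertices -> hyperedges -> vertices.

    Assumes every hyperedge contains at least one vertex.
    """
    n_vertices, n_edges = len(H), len(H[0])
    # Each hyperedge takes the mean feature of its member vertices.
    edge_feats = []
    for j in range(n_edges):
        members = [features[i] for i in range(n_vertices) if H[i][j]]
        edge_feats.append(sum(members) / len(members))
    # Each vertex takes the mean feature of its incident hyperedges;
    # an isolated vertex keeps its own feature.
    out = []
    for i in range(n_vertices):
        incident = [edge_feats[j] for j in range(n_edges) if H[i][j]]
        out.append(sum(incident) / len(incident) if incident else features[i])
    return out

vertices = ["p1", "p2", "p3", "laptop"]
# A person-person hyperedge can join more than two people at once.
person_person = [{"p1", "p2", "p3"}]
# Person-object hyperedges link people to an object they interact with.
person_object = [{"p1", "laptop"}, {"p2", "laptop"}]
hyperedges = person_person + person_object

H = incidence_matrix(vertices, hyperedges)
features = [1.0, 2.0, 3.0, 4.0]  # one scalar feature per vertex
updated = propagate(H, features)
```

A single hyperedge such as `{"p1", "p2", "p3"}` connects three people at once, which an ordinary graph edge cannot express. That is the property the abstract refers to when it describes interactions that extend beyond dyadic structures.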