Guoliang Xu, Jianqin Yin, Shaojie Zhang, Moonjun Gong
MLP-AIR: An effective MLP-based module for actor interaction relation learning in group activity recognition
Knowledge-Based Systems (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.2), published 2024-09-02. DOI: 10.1016/j.knosys.2024.112453. https://www.sciencedirect.com/science/article/pii/S0950705124010876
Citation count: 0
Abstract
Modeling actor interaction relations is crucial for group activity recognition. Previous approaches often adopt a fixed paradigm that computes an affinity matrix to model these relations, and this paradigm has yielded strong performance. On the one hand, the affinity matrix introduces an inductive bias: actor interaction relations should be computed dynamically from the input actor features. On the other hand, MLPs with static parameterization, whose parameters are fixed after training, can approximate arbitrary functions. It is therefore an open question whether this inductive bias is necessary for modeling actor interaction relations. To explore its impact, we propose an affinity-matrix-free paradigm that directly uses an MLP with static parameterization to model actor interaction relations. We term this approach MLP-AIR. This paradigm overcomes the limitations of the inductive bias and enhances the capture of implicit actor interaction relations. Specifically, MLP-AIR consists of two sub-modules: the MLP-based Interaction relation modeling module (MLP-I) and the MLP-based Relation refining module (MLP-R). MLP-I models spatial–temporal interaction relations by emphasizing cross-actor and cross-frame feature learning. MLP-R refines the relations between the channels of each relation feature, thereby enhancing the expressive power of the features. MLP-AIR is a plug-and-play module. To evaluate it, we applied MLP-AIR to replicate three representative methods and conducted extensive experiments on two widely used benchmarks, the Volleyball and Collective Activity datasets. The experiments demonstrate that MLP-AIR achieves favorable results. The code is available at https://github.com/Xuguoliang12/MLP-AIR.
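The contrast drawn in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation (see the linked repository for that). The abstract gives no layer details, so everything here is an assumption: the (frames, actors, channels) tensor layout, the two-layer MLP with GELU, the residual connections, the order of the mixing steps, and all function names are hypothetical, loosely following the common token-mixing MLP pattern. `affinity_mixing` shows the input-dependent paradigm, while `mlp_air_sketch` mixes across actors and frames (in the spirit of MLP-I) and then across channels (in the spirit of MLP-R) using only weights that are fixed after training.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def init_mlp(rng, dim, hidden):
    # Static parameters of a two-layer MLP mapping dim -> hidden -> dim.
    return {
        "w1": rng.normal(0.0, 0.1, (dim, hidden)), "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, dim)), "b2": np.zeros(dim),
    }

def mlp(params, x):
    # Applies the MLP along the last axis of x.
    hidden = gelu(x @ params["w1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"]

def mix_along(params, x, axis):
    # Mix information along one axis with a static MLP, plus a residual
    # connection (the residual is an assumption, not taken from the abstract).
    moved = np.moveaxis(x, axis, -1)
    return x + np.moveaxis(mlp(params, moved), -1, axis)

def affinity_mixing(x):
    # Affinity-matrix paradigm: the (actors x actors) weights are recomputed
    # from the input features of every frame. x has shape (T, N, C).
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])  # (T, N, N)
    scores = scores - scores.max(axis=-1, keepdims=True)        # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x                                          # (T, N, C)

def mlp_air_sketch(params, x):
    # Affinity-matrix-free paradigm: all mixing weights are fixed parameters.
    x = mix_along(params["actor"], x, axis=1)    # cross-actor (MLP-I-like)
    x = mix_along(params["frame"], x, axis=0)    # cross-frame (MLP-I-like)
    x = mix_along(params["channel"], x, axis=2)  # channel refinement (MLP-R-like)
    return x

# Hypothetical sizes: 3 frames, 12 actors, 16 feature channels.
T, N, C = 3, 12, 16
rng = np.random.default_rng(0)
params = {
    "actor": init_mlp(rng, N, 2 * N),
    "frame": init_mlp(rng, T, 2 * T),
    "channel": init_mlp(rng, C, 2 * C),
}
features = rng.normal(size=(T, N, C))
print(mlp_air_sketch(params, features).shape)  # (3, 12, 16)
print(affinity_mixing(features).shape)         # (3, 12, 16)
```

Both functions preserve the feature shape, which is what would let a module of this kind be swapped in for an affinity-based relation block. Note that the actor and frame MLPs are tied to fixed values of N and T, a constraint the affinity-based version does not have; how the actual module handles varying actor counts is not stated in the abstract.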
About the journal:
Knowledge-Based Systems is an international, interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computational techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.