Title: Combinatorial action recognition based on causal segment intervention
Author: Xiaozhou Sun
Venue: International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023), volume 252 1, pages 129692W - 129692W-6
Publication date: 2024-01-09
Publication type: Journal Article
DOI: 10.1117/12.3014465 (https://doi.org/10.1117/12.3014465)
Open access: no
Citation count: 0
Abstract
Combinatorial action recognition has recently attracted attention in computer vision. The task is to represent and discriminate the spatio-temporal interactions between actions and objects in video data. Existing work mostly strengthens a framework's object recognition and relationship modeling capabilities, for example with attention mechanisms and graph structures. We find that existing algorithms can be influenced by video segments that are unrelated to the interaction, which misleads them into attending to irrelevant visual information. To make the algorithm analyze the spatio-temporal interactions of only the causally related segments in a video, we propose a Causal Slice Recognition Network (CSRN). By explicitly recognizing and extracting the causally related segments, the method effectively removes the interference of background segments. We validate the method on the Something-Else dataset and obtain the best results.
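The abstract does not say how CSRN scores or extracts causally related segments, so the following is only a minimal sketch of the general idea of dropping background segments before aggregation, not the paper's method. It assumes that each video segment already has a feature vector and a relevance score from some upstream model; the names `select_causal_segments` and `pool_segments`, the top-k selection rule, and the mean pooling are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def select_causal_segments(scores: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-scoring segments.

    The indices are returned in temporal order so that downstream
    temporal modeling still sees the segments in sequence.
    Top-k selection is an assumption, not the paper's rule.
    """
    if keep <= 0:
        return []
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def pool_segments(
    features: Sequence[Sequence[float]], indices: Sequence[int]
) -> List[float]:
    """Average the feature vectors of the selected segments only."""
    if not indices:
        raise ValueError("at least one segment must be selected")
    dim = len(features[0])
    return [
        sum(features[i][d] for i in indices) / len(indices) for d in range(dim)
    ]


# Toy example: four segments with 2-D features. Segments 1 and 2 carry
# the interaction; segments 0 and 3 stand in for background.
features = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
scores = [0.1, 0.9, 0.8, 0.2]  # hypothetical relevance scores

kept = select_causal_segments(scores, keep=2)
clip_feature = pool_segments(features, kept)
```

In this toy case the two background segments are excluded, so the clip-level feature is computed from segments 1 and 2 alone. A learned model would produce the scores and might use a soft weighting instead of a hard top-k cut.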