{"title":"Spatiotemporal saliency and sub action segmentation for human action recognition","authors":"A. Babu, A. Shyna","doi":"10.1109/ICCCNT.2017.8204134","DOIUrl":null,"url":null,"abstract":"Human Action Recognition is a significant and challenging field of interest in Research and Industry. In this paper, the Selective Spatiotemporal Interest Points (Selective STIPs) are extracted from the input video and is labeled using a dictionary. The actions are segmented into sub-actions, and then the temporal and spatial structure is captured. The segmentation is done on the basis of interest point density. The spatial and temporal relationships between the labeled STIPs is represented using Space Salient and Time Salient directed graphs respectively. Time Salient pairwise feature (TSP) and Space Salient pairwise feature (SSP) is computed from corresponding directed graphs. The Selective STIP suppresses the background STIPs and detects more robust STIPs from the actors which improves performance of recognition. The Bag-of-Visual Words model combined with TSP and SSP for human action classification provides a more promising result.","PeriodicalId":6581,"journal":{"name":"2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT)","volume":"48 1","pages":"1-6"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCCNT.2017.8204134","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Human action recognition is a significant and challenging field of interest in both research and industry. In this paper, Selective Spatiotemporal Interest Points (Selective STIPs) are extracted from the input video and labeled using a dictionary. The actions are segmented into sub-actions on the basis of interest point density, and the temporal and spatial structure is then captured. The spatial and temporal relationships between the labeled STIPs are represented using Space Salient and Time Salient directed graphs, respectively. A Time Salient pairwise feature (TSP) and a Space Salient pairwise feature (SSP) are computed from the corresponding directed graphs. Selective STIP detection suppresses background STIPs and detects more robust STIPs on the actors, which improves recognition performance. Combining the Bag-of-Visual-Words model with TSP and SSP for human action classification yields more promising results.
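The dictionary-based labeling step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D descriptors, the dictionary of visual words, and the helper names are assumptions for illustration only. Each STIP descriptor is assigned the index of its nearest dictionary word, and the label counts form a Bag-of-Visual-Words histogram.

```python
# Hypothetical sketch: label STIP descriptors against a visual dictionary
# and build a Bag-of-Visual-Words histogram. Real STIP descriptors are
# high-dimensional; 2-D vectors are used here purely for clarity.

def label_descriptors(descriptors, dictionary):
    """Assign each descriptor the index of its nearest dictionary word
    (squared Euclidean distance)."""
    labels = []
    for d in descriptors:
        dists = [sum((a - b) ** 2 for a, b in zip(d, word))
                 for word in dictionary]
        labels.append(dists.index(min(dists)))
    return labels

def bovw_histogram(labels, vocab_size):
    """Count word occurrences to form the Bag-of-Visual-Words feature."""
    hist = [0] * vocab_size
    for l in labels:
        hist[l] += 1
    return hist

# Toy dictionary of 3 visual words and 4 STIP descriptors.
dictionary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (4.8, 5.1), (1.1, 0.8)]
labels = label_descriptors(descriptors, dictionary)
print(labels)                     # → [0, 1, 2, 1]
print(bovw_histogram(labels, 3))  # → [1, 2, 1]
```

In practice the dictionary itself is typically learned by clustering training descriptors (e.g. with k-means), and the resulting histogram is what a classifier such as an SVM consumes.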
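The segmentation of actions into sub-actions by interest point density can also be sketched. This is an assumed simplification, not the paper's exact procedure: per-frame STIP counts stand in for density, and the sequence is split wherever the count drops below a threshold, so that each run of high-density frames becomes one sub-action.

```python
# Hypothetical sketch: segment a video into sub-actions from per-frame
# interest-point counts. A frame whose count falls below `threshold`
# closes the current segment; runs of dense frames become sub-actions.

def segment_by_density(counts, threshold):
    """Return lists of frame indices, one list per sub-action segment."""
    segments, current = [], []
    for i, c in enumerate(counts):
        if c < threshold:
            if current:          # low-density frame ends the segment
                segments.append(current)
                current = []
        else:
            current.append(i)    # dense frame extends the segment
    if current:
        segments.append(current)
    return segments

# Toy per-frame STIP counts: two bursts of activity separated by a lull.
counts = [0, 5, 6, 7, 1, 0, 4, 5, 0]
print(segment_by_density(counts, 2))  # → [[1, 2, 3], [6, 7]]
```

The temporal ordering of these segments is what the Time Salient directed graph would then encode.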