Wanru Xu, Z. Miao, Jian Zhang, Qiang Zhang, Haohao Wu
2013 2nd IAPR Asian Conference on Pattern Recognition. Published 2013-11-05. DOI: 10.1109/ACPR.2013.114 (https://doi.org/10.1109/ACPR.2013.114)
Spatial-Temporal Context for Action Recognition Combined with Confidence and Contribution Weight
In this paper, we propose a new method for human action analysis in videos. From our perspective, a video sequence of a human action can be modeled through the distribution of features over the spatial-temporal domain. Relationships between features and each defined action are also explored to form discriminative feature sets. We first capture contextual correlations between local features through multiple windows. We then mine confidences from association rules and learn contributions from a trained SVM based on sample videos. Finally, by analyzing the feature distribution and the interactions between features over the spatial-temporal domain, we combine the contextual correlations with the relationships between words and their related actions to derive weights for the bag of feature words used in action matching. In most cases, our experiments indicate that the new method outperforms previously published results on the Weizmann and KTH datasets.