Action recognition by Huffman coding and implicit action model
Authors: Nijun Li, Tongchi Zhou, Lin Zhou, Zhen-yang Wu
Venue: 2015 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA)
Published: 2015-06-12
DOI: 10.1109/CIVEMSA.2015.7158603
Abstract: Human action recognition is a core problem in computer vision with great application value in intelligent human-computer interaction. Building on the Bag-of-Words (BoW) model, this work presents a framework for action recognition that combines Huffman coding with an Implicit Action Model (IAM). Specifically, Huffman coding provides a robust estimate of the visual words' conditional probabilities and outperforms the naïve Bayes method, while IAM captures the spatio-temporal relationships among local features and outperforms most other common machine learning methods. Spatio-Temporal Interest Points (STIPs) and Harris corners are employed as local features, and multichannel feature description is adopted to exploit the complementarity among different features. Experiments on the UCF-YouTube and HOHA2 datasets systematically compare the performance of various feature channels and machine learning methods, demonstrating the effectiveness of the approaches proposed in this paper. Finally, multiple augmentation mechanisms, such as feature fusion, hierarchical codebooks, and sparse coding, are integrated into the recognition system, achieving performance that compares favorably with the state of the art.
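The Huffman-coding step described in the abstract assigns shorter codes to more frequent visual words, so code length tracks each word's information content. As a rough illustration only (this is a standard Huffman construction over a hypothetical BoW histogram, not the authors' implementation; all names and frequencies below are invented):

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(frequencies):
    """Build a Huffman code (visual word -> bitstring) from word frequencies.

    More frequent words receive shorter codes; a word's code length is
    roughly -log2 of its probability, which is why code lengths can stand
    in for estimates of the words' (negative log) probabilities.
    """
    tiebreak = count()  # keeps heap comparisons well-defined on equal freqs
    heap = [(freq, next(tiebreak), {word: ""}) for word, freq in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate single-symbol codebook
        (_, _, codes), = heap
        return {w: "0" for w in codes}
    while len(heap) > 1:
        # Merge the two least-frequent subtrees, prefixing their codes.
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {w: "0" + code for w, code in c1.items()}
        merged.update({w: "1" + code for w, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Toy BoW histogram over a hypothetical 6-word visual codebook.
hist = Counter({"w0": 45, "w1": 13, "w2": 12, "w3": 16, "w4": 9, "w5": 5})
codes = huffman_codes(hist)
```

The resulting code is prefix-free, and the most frequent visual word gets the shortest code, which is the property the paper exploits when using Huffman coding as a probability estimator.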