Recognizing human actions
M. Shah
Proceedings of the third ACM international workshop on Video surveillance & sensor networks, 2005-11-11
https://doi.org/10.1145/1099396.1099397
Recognition of human actions from video sequences is a very active area of research in Computer Vision. An important step in any action recognition approach is the extraction of useful information from raw video data and its subsequent representation. The representation should account for the variability that arises when arbitrary cameras capture humans performing actions. The UCF Computer Vision group has been very active in the action recognition area. In this talk, I will present our action recognition work employing a variety of representations: a single point, anatomical landmarks on the human body, and the complete contour of the human body. I will also explicitly identify three important sources of variability: (1) viewpoint, (2) execution rate, and (3) anthropometry of actors, and propose a model of human actions that allows us to address all three. Our hypothesis is that the variability associated with the execution of an action can be closely approximated by a linear combination of action bases in joint spatio-temporal space. We demonstrate that such a model bounds the rank of a matrix of image measurements and that this bound can be used to achieve recognition of actions based only on imaged data.
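The rank argument can be illustrated with a small toy sketch. It is not the implementation from the talk: the basis vectors, coefficients, and the helper names `matrix_rank`, `combine`, and `is_consistent` are all hypothetical, and the example assumes noise-free measurements that are exact linear combinations of the bases, with no camera model. If every instance of an action is a combination of k bases, a matrix stacking those instances has rank at most k, and a new observation of the same action should not raise that rank.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot at or below the current pivot row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def combine(bases, coeffs):
    """Linear combination of the basis vectors with the given coefficients."""
    length = len(bases[0])
    return [sum(c * b[i] for c, b in zip(coeffs, bases)) for i in range(length)]


def is_consistent(instances, candidate, k):
    """True if appending the candidate keeps the measurement matrix within rank k."""
    return matrix_rank(instances + [candidate]) <= k


# Hypothetical toy data: two action bases, each a flattened spatio-temporal vector.
bases = [[1, 0, 2, 1, 3, 0], [0, 1, 1, 2, 0, 3]]
k = len(bases)

# Observed instances of one action: different mixtures of the same bases.
instances = [combine(bases, c) for c in [(1, 2), (3, -1), (2, 5), (4, 4)]]

same_action = combine(bases, (7, -3))   # lies in the span of the bases
other_motion = [1, 1, 1, 1, 1, 1]       # does not lie in that span

print(matrix_rank(instances))
print(is_consistent(instances, same_action, k))
print(is_consistent(instances, other_motion, k))
```

Exact rational arithmetic keeps the toy example free of tolerance choices. Since the abstract describes the linear model as a close approximation, real image measurements would presumably satisfy the bound only approximately, so a practical test would need a numerical rank with a threshold, for example on singular values.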