HFNet: A Novel Model for Human Focused Sports Action Recognition
Lianyu Hu, Lin Feng, Sheng-lan Liu
Proceedings of the 3rd International Workshop on Multimedia Content Analysis in Sports, 2020-10-12
DOI: https://doi.org/10.1145/3422844.3423052
Action recognition has attracted much attention recently and has progressed remarkably. However, sports action recognition, as a special kind of action recognition, is more difficult and deserves more attention. Our goal in this paper is to distinguish fine-grained, human-focused sports actions. Sports actions can be decomposed into sub-actions by body part, so it is necessary to establish the relationships among body parts and combine them to perform classification. Moreover, sports actions are usually fine-grained, and their subclasses are similar and hard to distinguish. Another difficult problem in practice is locating the actor in complicated scenes. Current action recognition methods, however, attend to the whole image and thus fail to capture details and to construct relationships within images. In this paper, we propose a novel model that constructs visual relationships in images through graph convolutions. We use patches cropped around body joints as input to the graph nodes, so our model can attend to the changes and details of body parts. We then carefully design the model to learn connections among graph nodes adaptively and empirically. We also provide another method for constructing visual relationships among graph nodes. By focusing specifically on relationships and details, our model achieves state-of-the-art performance on the complex human-focused sports datasets FSD-10 and Diving48.
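The idea of feeding joint-centred patches into a graph convolution can be illustrated with a minimal NumPy sketch. This is not the HFNet implementation: the function names, the patch size, the flattening of raw pixels into node features, the symmetric normalisation, and the additive `learned_offset` used to stand in for adaptively learned connections are all assumptions made for illustration only.

```python
import numpy as np


def crop_joint_patches(frame, joints, size=8):
    """Crop a size x size patch around each (x, y) joint, zero-padding at borders.

    frame: (H, W, C) array; joints: list of integer (x, y) pixel coordinates.
    Hypothetical helper, not taken from the paper.
    """
    half = size // 2
    padded = np.pad(frame, ((half, half), (half, half), (0, 0)))
    # After padding, original pixel (y, x) sits at (y + half, x + half),
    # so the window starting at (y, x) is centred on the joint.
    patches = [padded[y:y + size, x:x + size] for x, y in joints]
    return np.stack(patches)  # (num_joints, size, size, C)


def normalize_adjacency(adj):
    """Add self-loops and apply D^{-1/2} (A + I) D^{-1/2} (an assumed choice)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(node_feats, adj, weight, learned_offset=None):
    """One graph-convolution layer: ReLU((A_norm + B) @ X @ W).

    learned_offset (B) is a hypothetical trainable matrix added to the fixed
    skeleton adjacency so that connections can adapt during training.
    """
    a_norm = normalize_adjacency(adj)
    if learned_offset is not None:
        a_norm = a_norm + learned_offset
    return np.maximum(a_norm @ node_feats @ weight, 0.0)
```

In this sketch each node feature is simply the flattened patch, for example `crop_joint_patches(frame, joints).reshape(len(joints), -1)`; a real model would more plausibly pass each patch through a convolutional backbone first.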