HFNet: A Novel Model for Human Focused Sports Action Recognition

Proceedings of the 3rd International Workshop on Multimedia Content Analysis in Sports. Pub Date: 2020-10-12. DOI: 10.1145/3422844.3423052

Lianyu Hu, Lin Feng, Sheng-lan Liu


Citations: 4

Abstract

Action recognition has attracted much attention recently and progressed remarkably. However, as a special kind of action, sports action recognition is more difficult and deserves more attention. Our goal in this paper is to distinguish fine-grained human-focused sport actions. Sport actions can always be decomposed into sub-actions by body parts, and it is necessary to establish the relationships among body parts and combine them to perform classification. Besides, sport actions are usually fine-grained, and their subclasses are similar and hard to distinguish. Another tough problem in practice is locating the actor in complicated circumstances. However, current methods in action recognition always pay attention to the whole image, thus failing to capture details and construct relationships in images. In this paper, we propose a novel model to construct visual relationships in images through graph convolutions. We make use of patches cropped around body joints as input for graph nodes. Thus our model is able to pay attention to the changes and details of body parts. Then, we carefully design the model to learn connections among graph nodes adaptively and empirically. We also provide another method to construct visual relationships for graph nodes. By specially focusing on relationships and details, our model achieves state-of-the-art performance on the complex human-focused sports datasets FSD-10 and Diving48.
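The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: patches cropped around body joints become graph nodes, and a graph convolution mixes node features along a connectivity matrix that can be adjusted by a learned term. It is not the authors' HFNet. The function names, the patch size, the four hypothetical joints, the use of raw flattened pixels as node features (a real model would likely use CNN features), and the standard `ReLU(Â X W)` layer with an additive offset are all assumptions made for illustration.

```python
import numpy as np


def crop_patches(frame, joints, size=8):
    """Crop a size x size patch around each integer (x, y) joint.

    The frame is edge-padded so joints near the border still yield full patches.
    """
    half = size // 2
    padded = np.pad(frame, ((half, half), (half, half), (0, 0)), mode="edge")
    # Original pixel (y, x) sits at (y + half, x + half) in the padded frame,
    # so the window starting at (y, x) is approximately centered on the joint.
    patches = [padded[y:y + size, x:x + size, :] for x, y in joints]
    return np.stack(patches)  # (num_joints, size, size, channels)


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(node_feats, adj, weight, learned_offset=None):
    """One graph convolution layer: ReLU(A_norm @ X @ W).

    learned_offset stands in for an adaptively learned connectivity term
    (an assumption here); it is added to the fixed graph and clamped at zero.
    """
    if learned_offset is not None:
        adj = np.maximum(adj + learned_offset, 0.0)
    return np.maximum(normalize_adjacency(adj) @ node_feats @ weight, 0.0)


rng = np.random.default_rng(0)
frame = rng.random((64, 64, 3))  # stand-in for one video frame

# Hypothetical (x, y) joints: head, torso, left foot, right foot.
joints = np.array([[32, 10], [32, 30], [20, 50], [44, 50]])

# Fixed skeleton edges: head-torso, torso-left foot, torso-right foot.
skeleton = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (1, 3)]:
    skeleton[i, j] = skeleton[j, i] = 1.0

patches = crop_patches(frame, joints)
node_feats = patches.reshape(len(joints), -1)  # one feature row per joint
weight = rng.normal(scale=0.1, size=(node_feats.shape[1], 16))
offset = rng.normal(scale=0.1, size=(4, 4))  # would be trained in practice

out = graph_conv(node_feats, skeleton, weight, offset)
print(patches.shape, out.shape)
```

In this sketch the fixed skeleton encodes connections chosen by hand, while the offset plays the role of connections that training could adjust; both the weight and the offset are random here, so the output only demonstrates shapes and data flow.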
