{"title":"Skeleton Based Action Quality Assessment of Figure Skating Videos","authors":"Huiyong Li, Qing Lei, Hongbo Zhang, Jixiang Du","doi":"10.1109/ITME53901.2021.00048","DOIUrl":null,"url":null,"abstract":"Action quality assessment(AQA) aims at achieving automatic evaluation the performance of human actions in video. Compared with action recognition problem, AQA focuses more on subtle differences both in spatial and temporal dimensions during the whole executing process of actions. However, most existing AQA methods tried to extract features directly from RGB videos through a 3D ConvNets, which makes the features mixed with useless scene information. To overcome this problem, We propose a deep pose feature learning AQA method that captured detailed and meaningful representations for skeleton information to discover the subtle motion difference of AQA problem. We first apply pose estimation method to obtain human skeleton data from RGB videos. Then a spatio-temporal graph convolutional network (ST-GCN) is employed to extract the dynamic changes of skeleton data and obtain the representative pose features. Finally, a regressor composed of three fully connected layers is developed to reduce the dimension of the obtained pose features and predict the final score. Experiments on MIT figure skating dataset have been extensively conducted, and the results demonstrate that the proposed method has achieved improvements that outperformed current state-of-the-art methods.","PeriodicalId":6774,"journal":{"name":"2021 11th International Conference on Information Technology in Medicine and Education (ITME)","volume":"64 1","pages":"196-200"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 11th International Conference on Information Technology in Medicine and Education (ITME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITME53901.2021.00048","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Action quality assessment (AQA) aims to automatically evaluate the performance of human actions in videos. Compared with action recognition, AQA focuses more on subtle differences in both the spatial and temporal dimensions across the whole execution of an action. However, most existing AQA methods extract features directly from RGB videos through 3D ConvNets, which mixes the features with irrelevant scene information. To overcome this problem, we propose a deep pose feature learning method for AQA that captures detailed and meaningful representations of skeleton information to discover the subtle motion differences underlying the AQA problem. We first apply a pose estimation method to obtain human skeleton data from RGB videos. Then a spatio-temporal graph convolutional network (ST-GCN) is employed to model the dynamic changes of the skeleton data and extract representative pose features. Finally, a regressor composed of three fully connected layers reduces the dimensionality of the pose features and predicts the final score. Extensive experiments on the MIT figure skating dataset demonstrate that the proposed method outperforms current state-of-the-art methods.
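The final stage described above is a small regression head on top of the ST-GCN features. Below is a minimal PyTorch sketch of such a three-fully-connected-layer regressor, assuming globally pooled pose features of dimension 256 with hidden widths 128 and 64; the abstract does not specify these sizes, and the `ScoreRegressor` name and all dimensions here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the score-regression head: three fully connected
# layers that reduce the dimension of pooled ST-GCN pose features and
# predict a single quality score. Feature dimension (256) and hidden
# sizes (128, 64) are assumptions, not values from the paper.
import torch
import torch.nn as nn

class ScoreRegressor(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 128),  # first FC layer: dimension reduction
            nn.ReLU(),
            nn.Linear(128, 64),        # second FC layer: further reduction
            nn.ReLU(),
            nn.Linear(64, 1),          # third FC layer: scalar quality score
        )

    def forward(self, pose_features: torch.Tensor) -> torch.Tensor:
        # pose_features: (batch, feat_dim), e.g. ST-GCN output pooled
        # over joints and time; returns one predicted score per clip.
        return self.fc(pose_features).squeeze(-1)

# Stand-in for features an ST-GCN backbone would produce from skeleton data.
features = torch.randn(4, 256)
scores = ScoreRegressor()(features)
print(scores.shape)  # torch.Size([4])
```

In this sketch the head would typically be trained with a regression loss (e.g. MSE against judges' scores), which is the standard setup for score-prediction AQA; the abstract itself does not state the training objective.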