Skeleton-Based Motion Recognition for Labanotation Generation Based on the Fusion of Neural Networks

Impact Factor 1.7 · CAS Zone 4 (Computer Science) · JCR Q4, COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computer Animation and Virtual Worlds Pub Date : 2025-09-08 DOI:10.1002/cav.70073

Jiasheng Du, Jiaji Wang, Jianpo Li

Volume 36, Issue 5 · Journal Article · https://onlinelibrary.wiley.com/doi/10.1002/cav.70073

Citations: 0

Abstract

Labanotation is a scientific method for documenting dance movements that has been widely adopted around the world. Existing methods for Labanotation action recognition handle complex movements poorly and integrate spatiotemporal information inadequately. To address this, we propose a multi-branch spatiotemporal fusion network with attention mechanisms for accurately recognizing Labanotation actions from motion capture data. First, we convert motion capture data into three-dimensional coordinates and extract skeleton vector features. Next, we enrich the feature representation by extracting temporal difference features and skeleton angle features from the skeleton vectors. These features are processed by gated recurrent units and residual networks to integrate spatiotemporal information. Finally, attention mechanisms weight the importance of different positions in the features. The method models spatiotemporal relationships effectively and thereby improves the accuracy of Labanotation action recognition. Experiments on two segmented motion capture datasets demonstrate the effectiveness of each module. Compared with existing methods, our approach shows superior performance and strong generalization ability. Because upper limb action recognition is relatively simple, we focus primarily on lower limb action recognition. Notably, this is the first application of skeleton angle features in the field of Labanotation action recognition.
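The feature-extraction steps named in the abstract (skeleton vectors, temporal differences, skeleton angles) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-joint leg chain, the `BONES` list, and all function names are hypothetical, and the angle is assumed here to be the angle between consecutive bone vectors in the chain. The gated recurrent unit, residual, and attention stages are not sketched.

```python
import math

# Hypothetical 3-joint leg chain (hip, knee, ankle); a real motion capture
# skeleton has many more joints. Each pair is (parent index, child index).
BONES = [(0, 1), (1, 2)]

def skeleton_vectors(frame, bones=BONES):
    """Bone vector = child joint position minus parent joint position."""
    return [tuple(c - p for p, c in zip(frame[a], frame[b])) for a, b in bones]

def temporal_differences(vector_seq):
    """Frame-to-frame change of every bone vector (T frames give T-1 entries)."""
    return [
        [tuple(n - c for c, n in zip(cur, nxt)) for cur, nxt in zip(f0, f1)]
        for f0, f1 in zip(vector_seq, vector_seq[1:])
    ]

def angle_between(u, v):
    """Angle in radians between two vectors; 0.0 if either has zero length."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))

def extract_features(joint_seq):
    """Return (skeleton vectors, temporal differences, angles) for a sequence."""
    vectors = [skeleton_vectors(frame) for frame in joint_seq]
    diffs = temporal_differences(vectors)
    # Assumption: bones are listed as a chain, so neighbours share a joint.
    angles = [
        [angle_between(v[i], v[i + 1]) for i in range(len(v) - 1)]
        for v in vectors
    ]
    return vectors, diffs, angles

# Two frames: a straight leg, then the lower leg swung forward at the knee.
seq = [
    [(0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.5, 0.0)],
]
vectors, diffs, angles = extract_features(seq)
```

In this toy sequence the angle at the knee goes from 0 to a right angle, and only the lower-leg bone has a nonzero temporal difference, which is the kind of signal the angle and difference branches are meant to expose to the downstream network.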




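Source journal

Computer Animation and Virtual Worlds (Engineering and Technology: Computer Science, Software Engineering)

CiteScore: 2.20

Self-citation rate: 0.00%

Articles published: not listed

Review time: 6-12 weeks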

About the journal: With the advent of very powerful PCs and high-end graphics cards, there has been remarkable development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with the real world through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only a beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.