Kinematic matrix: One-shot human action recognition using kinematic data structure

IF 7.5 | CAS Zone 2, Computer Science | JCR Q1, AUTOMATION & CONTROL SYSTEMS

Engineering Applications of Artificial Intelligence Pub Date : 2024-10-31 DOI:10.1016/j.engappai.2024.109569

Mohammad Hassan Ranjbar , Ali Abdi , Ju Hong Park

Volume 139, Article 109569. Publication type: Journal Article. Source: https://www.sciencedirect.com/science/article/pii/S0952197624017275

Citations: 0

Abstract

One-shot action recognition, which refers to recognizing human-performed actions using only a single training example, holds significant promise in advancing video analysis, particularly in domains requiring rapid adaptation to new actions. However, existing algorithms for one-shot action recognition face multiple challenges, including high computational complexity, limited accuracy, and difficulties in generalization to unseen actions. To address these issues, we propose a novel kinematic-based skeleton representation that effectively reduces computational demands while enhancing recognition performance. This representation leverages skeleton locations, velocities, and accelerations to formulate the one-shot action recognition task as a metric learning problem, where a model projects kinematic data into an embedding space. In this space, actions are distinguished based on Euclidean distances, facilitating efficient nearest-neighbour searches among activity reference samples. Our approach not only reduces computational complexity but also achieves higher accuracy and better generalization compared to existing methods. Specifically, our model achieved a validation accuracy of 78.5%, outperforming state-of-the-art methods by 8.66% under comparable training conditions. These findings underscore the potential of our method for practical applications in real-time action recognition systems.
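The pipeline described in the abstract (kinematic features, an embedding, then a Euclidean nearest-neighbour search over one reference sample per action) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: velocities and accelerations are taken as first and second finite differences, the `embed` function is a hypothetical stand-in (a simple time average) for the trained metric-learning model, and all function names and the toy single-joint data are invented for the example.

```python
import math


def finite_difference(frames):
    """Frame-to-frame differences of a sequence of flat coordinate vectors."""
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]


def kinematic_features(positions):
    """Concatenate position, velocity and acceleration for each frame.

    positions: list of T frames, each a flat list of joint coordinates.
    Assumption: velocity and acceleration are first and second finite
    differences. The first two frames are dropped so the three parts align.
    """
    vel = finite_difference(positions)
    acc = finite_difference(vel)
    return [p + v + a for p, v, a in zip(positions[2:], vel[1:], acc)]


def embed(features):
    """Hypothetical placeholder embedding: average each feature over time.

    In the paper a learned model projects the kinematic data into the
    embedding space; this average only keeps the sketch runnable.
    """
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def classify(query, references):
    """Return the label of the reference nearest to the query (Euclidean)."""
    q = embed(kinematic_features(query))
    return min(
        references,
        key=lambda label: math.dist(q, embed(kinematic_features(references[label]))),
    )


# One reference sample per action (the one-shot setting): a single 2-D joint.
references = {
    "step_right": [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]],
    "jump": [[0.0, 0.0], [0.0, 1.0], [0.0, 3.0], [0.0, 6.0]],
}
# A noisy, unseen sample that moves mostly to the right.
query = [[0.0, 0.0], [1.1, 0.0], [2.0, 0.0], [3.2, 0.0]]

print(classify(query, references))
```

Because classification is a distance lookup against stored reference embeddings, adding a new action in this sketch only requires adding one reference entry, with no retraining step.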




Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology; Engineering: Electronic & Electrical)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days

Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.