Self-adaptive uncertainty modeling and relation reasoning for cross-modal egocentric human action recognition in sports

IF 4.0 | Zone 3, Computer Science | JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

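Computers & Electrical Engineering, Volume 127, Article 110583 | Pub Date: 2025-07-26 | DOI: 10.1016/j.compeleceng.2025.110583 | https://www.sciencedirect.com/science/article/pii/S0045790625005269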

Zhangzhi Zhao, Chen He, Xun Jiang, Xing Xu

Citations: 0

Self-adaptive uncertainty modeling and relation reasoning for cross-modal egocentric human action recognition in sports

Human action recognition (HAR) aims to interpret human actions from provided data and is widely applied in the sports domain. However, HAR in sports presents unique challenges, because egocentric video often lacks complete body information. In addition, individual variation and aleatoric errors during data collection can further corrupt cross-modal interactions. For example, athletes' differing movement habits and body types, as well as changes in where sensors (such as IMUs) are worn, introduce individual differences and random noise into inertial data. To address these issues, we emphasize the distinctiveness of human actions in sports and propose a novel framework for robust feature embedding and enhanced cross-modal interaction, termed Self-adaptive Uncertainty Modeling and Relation Reasoning (SUMRR), designed specifically for egocentric human action recognition in sports. Our approach first samples robust unimodal features from an uncertainty perspective, which helps mitigate individual variance and reduce aleatoric errors. Then, by modeling relation-level interactions across modalities, we construct robust cross-modal features that significantly enhance recognition performance. We evaluate SUMRR on cross-modal, egocentric sports datasets with various backbone architectures and achieve a recognition precision of 89.49%. Experimental results demonstrate the portability and robustness of SUMRR for egocentric human action recognition in sports.
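The abstract names two steps (sampling unimodal features from an uncertainty perspective, then modeling relation-level interactions across modalities) without giving their formulation. The sketch below is only an illustration of that general idea, not the SUMRR method: it assumes each modality's feature is a diagonal Gaussian sampled with the reparameterization trick, and it substitutes cosine-similarity attention for the relation reasoning. All function names, shapes, and variance values are hypothetical.

```python
import numpy as np

def sample_uncertain_feature(mu, log_var, rng):
    """Draw one sample from N(mu, diag(exp(log_var))) via reparameterization."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def relation_matrix(video_feats, imu_feats):
    """Cosine similarity between every video token and every IMU token."""
    v = video_feats / np.linalg.norm(video_feats, axis=-1, keepdims=True)
    s = imu_feats / np.linalg.norm(imu_feats, axis=-1, keepdims=True)
    return v @ s.T

def fuse(video_feats, imu_feats):
    """Attend from video tokens to IMU tokens with a softmax over relations,
    then concatenate each video token with its attended IMU summary."""
    rel = relation_matrix(video_feats, imu_feats)
    weights = np.exp(rel - rel.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.concatenate([video_feats, weights @ imu_feats], axis=-1)

# Hypothetical inputs: 4 video tokens and 6 IMU tokens, 8 dimensions each.
# In a real model, mu and log_var would be predicted by modality encoders.
rng = np.random.default_rng(0)
video_mu, video_log_var = rng.standard_normal((4, 8)), np.full((4, 8), -2.0)
imu_mu, imu_log_var = rng.standard_normal((6, 8)), np.full((6, 8), -1.0)

video_z = sample_uncertain_feature(video_mu, video_log_var, rng)
imu_z = sample_uncertain_feature(imu_mu, imu_log_var, rng)
fused = fuse(video_z, imu_z)
print(fused.shape)  # (4, 16)
```

In this toy setup the IMU stream is given a larger variance than the video stream, mimicking noisier inertial data; a learned variance would let a model down-weight unreliable samples, which is one common reading of "uncertainty modeling" but is an assumption here.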

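Source journal

Computers & Electrical Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.20
Self-citation rate: 7.00%
Articles published: 661
Review time: 47 days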

Journal description: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.