Self-adaptive uncertainty modeling and relation reasoning for cross-modal egocentric human action recognition in sports
Authors: Zhangzhi Zhao, Chen He, Xun Jiang, Xing Xu
Journal: Computers & Electrical Engineering, volume 127, Article 110583
Published: 2025-07-26
DOI: 10.1016/j.compeleceng.2025.110583
URL: https://www.sciencedirect.com/science/article/pii/S0045790625005269
Human action recognition (HAR) aims to interpret human actions from provided data and has found widespread application in the sports domain. However, HAR in sports presents unique challenges, as egocentric video often lacks complete body information. Additionally, individual variations and aleatoric errors during data collection can further corrupt cross-modal interactions. For example, differences in athletes' movement habits and body types, and changes in where sensors such as inertial measurement units (IMUs) are worn, can introduce individual differences and random noise into inertial data. To address these issues, we emphasize the distinctiveness of human actions in sports and propose a novel framework for robust feature embedding and enhanced cross-modal interaction, termed Self-adaptive Uncertainty Modeling and Relation Reasoning (SUMRR), specifically designed for egocentric human action recognition in sports. Our approach begins by sampling robust unimodal features from an uncertainty perspective, which helps mitigate individual variance and reduce aleatoric errors. Furthermore, by modeling relation-level interactions across modalities, we construct robust cross-modal features that significantly enhance recognition performance. We evaluate the proposed SUMRR framework on cross-modal, egocentric sports datasets using various backbone architectures and achieve 89.49% recognition precision. Experimental results demonstrate the portability and robustness of SUMRR for egocentric human action recognition in sports.
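The abstract does not spell out how features are sampled "from an uncertainty perspective". As an illustration only, the sketch below shows one common way to do this in general: treat each unimodal feature as a Gaussian with a predicted mean and log-variance, and draw samples with the reparameterization z = mean + sigma * eps. This is an assumption, not the SUMRR formulation; the function names `sample_feature` and `averaged_feature` and all values are hypothetical.

```python
import math
import random


def sample_feature(mean, log_var, rng):
    """Draw one feature vector z = mean + sigma * eps, with eps ~ N(0, 1).

    Hypothetical helper: `mean` and `log_var` stand in for the outputs of
    some unimodal encoder (e.g. video or IMU); they are not from the paper.
    """
    if len(mean) != len(log_var):
        raise ValueError("mean and log_var must have the same length")
    # sigma = exp(0.5 * log_var) keeps the standard deviation positive.
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def averaged_feature(mean, log_var, num_samples, rng):
    """Average several draws so that per-sample noise partly cancels out."""
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    total = [0.0] * len(mean)
    for _ in range(num_samples):
        draw = sample_feature(mean, log_var, rng)
        total = [t + d for t, d in zip(total, draw)]
    return [t / num_samples for t in total]


if __name__ == "__main__":
    rng = random.Random(0)       # fixed seed so runs are repeatable
    imu_mean = [0.2, -1.0, 0.5]  # made-up values for illustration
    imu_log_var = [-2.0, 0.0, 1.0]
    print(sample_feature(imu_mean, imu_log_var, rng))
    print(averaged_feature(imu_mean, imu_log_var, 1000, rng))
```

A dimension with a large log-variance yields widely scattered samples, while a dimension with a very negative log-variance stays close to its mean, which is the sense in which such a representation can express how reliable each feature is.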
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.