ResFNN: Residual Structure-Based Feedforward Neural Network for Action Quality Assessment in Sports Consumer Electronics

IF 4.3 | Zone 2, Computer Science | Q1, ENGINEERING, ELECTRICAL & ELECTRONIC
Honghao Gao;Si Yu;Muddesar Iqbal;Mohsen Guizani
Journal: IEEE Transactions on Consumer Electronics, vol. 70, no. 4, pp. 6653-6663
Publication date: 2024-10-17
Publication type: Journal Article
DOI: 10.1109/TCE.2024.3482560
URL: https://ieeexplore.ieee.org/document/10720818/
Open access: no
Citations: 0

Abstract

With the development of artificial intelligence (AI) and sports consumer electronics, AI-empowered Olympic sport technologies are being implemented more extensively. Action quality assessment (AQA), a sport action recognition and video refereeing technology, aims to automatically score action performance in videos obtained from sports consumer electronics deployed in arenas. It has gained much attention for its wide range of applications, such as sports event scoring, specific skill assessment, and rehabilitation medicine. General methods score action performance by directly regressing the initial video features to score, which neglects the possibility that the initial features are insufficiently effective. To address this issue, we propose a residual structure-based feedforward neural network (ResFNN) that enables efficient action feature learning to attain improved score assessment performance. First, the input videos are downsampled to clips and passed through inflated 3D convolutional networks (ConvNets) to obtain initial action video features. These features contain spatiotemporal information about the human actions occurring in the videos. Second, these features are aggregated and learned through our ResFNN. The ResFNN is composed of feedforward neural network residual blocks, which have strong function fitting and feature conversion capabilities. Therefore, the network learns features well and obtains more effective features. Third, a score distribution regression method is applied to obtain the underlying score distribution. This step establishes a more accurate mapping between the videos and scores. Finally, our method is demonstrated to outperform the majority of the existing methods through experiments conducted on the AQA-7, MTL-AQA, and JIGSAWS datasets.
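
The three stages described in the abstract (clip features, residual feedforward blocks, score distribution regression) can be sketched roughly as below. This is a minimal, dependency-free illustration and not the authors' implementation: the abstract gives no layer sizes, aggregation rule, or distribution family, so the mean-pooling in `average_clips`, the two-layer ReLU block in `residual_ffn_block`, and the Gaussian (mean, standard deviation) head in `predict_score_distribution` are all assumptions, and every function name here is hypothetical. The clip features stand in for the output of an inflated 3D ConvNet, which is not reproduced.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map; `weights` holds one row of coefficients per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_ffn_block(x, params):
    """Skip connection around a two-layer feedforward net: y = x + F(x)."""
    w1, b1, w2, b2 = params
    hidden = relu(linear(x, w1, b1))
    update = linear(hidden, w2, b2)
    return [xi + ui for xi, ui in zip(x, update)]


def init_block(dim, hidden_dim, rng, scale=0.1):
    """Small random weights and zero biases (illustrative initialisation)."""
    w1 = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, scale) for _ in range(hidden_dim)] for _ in range(dim)]
    return w1, [0.0] * hidden_dim, w2, [0.0] * dim


def average_clips(clip_features):
    """Aggregate per-clip vectors by element-wise mean (assumed aggregation)."""
    n = len(clip_features)
    return [sum(column) / n for column in zip(*clip_features)]


def predict_score_distribution(features, blocks, head):
    """Refine features with residual blocks, then map to Gaussian parameters.

    The Gaussian parameterisation is an assumption made for this sketch.
    """
    x = features
    for params in blocks:
        x = residual_ffn_block(x, params)
    (w_mu, b_mu), (w_sigma, b_sigma) = head
    mu = sum(w * v for w, v in zip(w_mu, x)) + b_mu
    raw = sum(w * v for w, v in zip(w_sigma, x)) + b_sigma
    sigma = math.log1p(math.exp(raw))  # softplus keeps the std positive
    return mu, sigma


def sample_score(mu, sigma, rng):
    """Draw one score from the predicted distribution."""
    return mu + sigma * rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, hidden_dim = 8, 16
    # Placeholder clip features; a real pipeline would take these from I3D.
    clips = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(10)]
    blocks = [init_block(dim, hidden_dim, rng) for _ in range(3)]
    head = (([rng.gauss(0.0, 0.1) for _ in range(dim)], 0.0),
            ([rng.gauss(0.0, 0.1) for _ in range(dim)], 0.0))
    mu, sigma = predict_score_distribution(average_clips(clips), blocks, head)
    print("mean:", mu, "std:", sigma, "sample:", sample_score(mu, sigma, rng))
```

The skip connection means that a block whose second layer outputs zero leaves the input features unchanged, so each block only has to learn a correction to the initial features rather than a full replacement for them.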
Source journal
CiteScore: 7.70
Self-citation rate: 9.30%
Articles published: 59
Review time: 3.3 months
About the journal: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture, or end use of mass-market electronics, systems, software, and services for consumers.