Learning referee evaluation and assessing action quality from coarse to fine in diving sport
Hong-Ming Qiu, Hong-Bo Zhang, Qing Lei, Jing-Hua Liu, Ji-Xiang Du
Neurocomputing, volume 648, Article 130664. Published 2025-06-16. DOI: 10.1016/j.neucom.2025.130664
Journal impact factor: 5.5 (JCR Q1, Computer Science, Artificial Intelligence)
https://www.sciencedirect.com/science/article/pii/S0925231225013360
Citations: 0
Abstract
Intelligently assessing the quality of athletic performances in sports scenarios remains a challenging problem in computer vision. Two obstacles stand out: distinguishing the subtle differences between two similar actions in videos, and mapping the resulting video representations to quality scores. To address them, this work shifts the paradigm of quality score estimation from traditional relative quality score prediction to relative referee score prediction. To support this shift, a cross-feature fusion module built on a Transformer-based video representation is introduced to improve pairwise video feature learning for action quality assessment. A novel contrastive action parsing decoder module then generates mid-level representations that connect visual features with detailed quality scores. Both modules use cross-attention mechanisms: the former refines the pairwise video features to represent the differences between video pairs, while the latter updates the input queries corresponding to each referee’s evaluation. Finally, for precise quality score estimation, a coarse-to-fine decision process combines a score classifier with an offset regressor. Experiments on the challenging diving datasets MTL-AQA, FineDiving, and TASD-2 show that the proposed approach is effective and feasible compared with state-of-the-art methods.
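The coarse-to-fine decision step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that a score classifier is combined with an offset regressor. The equal-width bins, the per-bin offset in [0, 1], the relative range of ±10, the 0 to 10 referee scale, and the function names `coarse_to_fine_decode` and `predict_referee_score` are all assumptions made for illustration.

```python
from typing import Sequence


def coarse_to_fine_decode(
    bin_logits: Sequence[float],
    bin_offsets: Sequence[float],
    low: float,
    high: float,
) -> float:
    """Decode a scalar from classifier logits (coarse) and per-bin offsets (fine).

    Assumption: [low, high] is split into equal-width bins, and each offset
    is a fraction in [0, 1] of the bin width.
    """
    num_bins = len(bin_logits)
    if num_bins == 0 or len(bin_offsets) != num_bins:
        raise ValueError("bin_logits and bin_offsets must be non-empty and equal in length")
    width = (high - low) / num_bins
    # Coarse step: choose the bin with the highest classifier logit.
    best = max(range(num_bins), key=lambda i: bin_logits[i])
    # Fine step: move inside the chosen bin by the (clamped) regressed offset.
    offset = min(max(bin_offsets[best], 0.0), 1.0)
    return low + (best + offset) * width


def predict_referee_score(
    exemplar_score: float,
    bin_logits: Sequence[float],
    bin_offsets: Sequence[float],
    max_delta: float = 10.0,
) -> float:
    """Relative prediction: exemplar referee score plus a decoded difference.

    Assumption: referee scores lie in [0, 10], so the difference lies in
    [-max_delta, max_delta]; the result is clamped back to [0, 10].
    """
    delta = coarse_to_fine_decode(bin_logits, bin_offsets, -max_delta, max_delta)
    return min(max(exemplar_score + delta, 0.0), 10.0)


if __name__ == "__main__":
    # Hypothetical outputs for one referee query, with four bins over [-10, 10].
    logits = [0.0, 0.0, 3.0, 0.0]
    offsets = [0.5, 0.5, 0.5, 0.5]
    print(predict_referee_score(7.0, logits, offsets))
```

In this sketch the classifier only has to identify the right interval, and the regressor resolves the position within it, which is the usual motivation for splitting a regression target into a coarse and a fine stage.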
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.