Advancements in basketball action recognition: Datasets, methods, explainability, and synthetic data applications

IF 4.2 · Zone 3 (Computer Science) · JCR Q2, Computer Science, Artificial Intelligence

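Image and Vision Computing, Volume 162, Article 105689 · Pub Date: 2025-08-07 · DOI: 10.1016/j.imavis.2025.105689
URL: https://www.sciencedirect.com/science/article/pii/S026288562500277X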

Marco Caruso, Lucia Cimmino, Fabio Narducci, Chiara Pero, Gianluca Ronga

Citations: 0

Abstract

Advancements in basketball action recognition: Datasets, methods, explainability, and synthetic data applications

Basketball Action Recognition (BAR) has received increasing attention in the fields of computer vision and artificial intelligence, serving as a fundamental component in performance evaluation, automated game annotation, tactical analysis, and referee decision-making support. Despite notable advancements driven by deep learning approaches, BAR remains a challenging task due to the inherent complexity of basketball movements, frequent occlusions, and limited availability of standardized benchmark datasets. This survey provides a comprehensive and structured synthesis of current developments in BAR research, encompassing four principal dimensions: dataset curation, computational methodologies, synthetic data generation, and model explainability. A critical analysis of publicly available basketball-specific datasets is presented, delineating their modalities, annotation strategies, action taxonomies, and representational scope. Furthermore, the survey offers a structured classification of state-of-the-art action recognition methodologies, ranging from video-based and skeleton-based models to sensor-driven and multimodal fusion approaches, emphasizing architectural characteristics, evaluation protocols, and task-specific adaptations. The role of synthetic data is systematically examined as a means to address data scarcity, reduce annotation noise, and enhance model generalization through controlled variability and simulation-based augmentation. In parallel, the integration of explainable artificial intelligence (XAI) techniques is also analyzed, with a focus on post-hoc attribution methods, probabilistic reasoning models, and interpretable neural architectures, aimed at improving the transparency and accountability of decision-making processes. The survey identifies persisting research challenges, including dataset heterogeneity, limitations in cross-domain transferability, and the accuracy-interpretability trade-off in deep models. By delineating current limitations and prospective directions, this work provides a foundational reference to guide the development of robust, generalizable, and explainable BAR systems for deployment in real-world sports intelligence applications.

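Source journal

Image and Vision Computing · Engineering & Technology: Engineering, Electrical & Electronic

CiteScore: 8.50

Self-citation rate: 8.50%

Articles published: 143

Review time: 7.8 months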

About the journal: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.