Machine Learning in Rugby Union: Predicting and Identifying Key Performance Indicators for Professional Rugby Union Players in Match Play Based Workload

Impact factor (IF): 3.0
Xiangyu Ren, Simon Boisbluche, Kilian Philippe, Mathieu Demy, Sami Äyrämö, Ilkka Rautiainen, Shuzhe Ding, Jacques Prioux
Journal: European Journal of Sport Science, vol. 25, no. 9. Published 2025-08-22. Journal article.
DOI: 10.1002/ejsc.70042
URL: https://onlinelibrary.wiley.com/doi/10.1002/ejsc.70042

Abstract

Rugby union is an intermittent, high-intensity contact sport whose analysis draws on a range of training and match metrics. Time-motion analysis and video analysis have improved understanding of the interplay between training and match metrics. However, few studies have investigated the effect of workload on key performance indicators (KPIs) during matches. In this study, data collected from the global positioning system (GPS) were used to calculate cumulative workload values over the 7, 14, and 21 days prior to each game. After dimensionality reduction through principal component analysis (PCA), these workload values were used as features, with match KPIs as target variables. Regression models were built using linear regression (LR), support vector regression (SVR), random forest regression (RFR), and light gradient boosting machine (LightGBM). Model performance was assessed using the coefficient of determination ($R^2$), root mean square error (RMSE), and correlation coefficient ($r$). The findings revealed that although individual GPS metrics exhibited weak correlations with KPIs, machine learning (ML) models, particularly RFR, successfully captured complex interactions and nonlinear relationships. These models achieved significantly improved predictive performance, with $R^2$ values ranging from 0.40 to 0.72 for certain KPIs. Using SHapley Additive exPlanations (SHAP) analysis and partial dependence plots, this study enhanced the interpretability of the ML models by identifying the influence of GPS features on KPIs and exploring their underlying mechanisms. These findings offer actionable insights for workload management, emphasizing critical factors that affect player performance.
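
The abstract's feature construction and evaluation metrics can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the daily load values in the usage example are made up, and two details are assumptions that the abstract does not state (the match day itself is excluded from each window, and days without a recorded session count as zero load). PCA, model fitting, and SHAP are left out; only the 7-, 14-, and 21-day cumulative sums and the three evaluation metrics are shown.

```python
import math
from datetime import date, timedelta


def cumulative_workload(daily_load, match_day, window_days):
    """Sum one GPS metric over the `window_days` days before `match_day`.

    Assumptions (not stated in the abstract): the match day is excluded,
    and days with no recorded session contribute zero load.
    """
    return sum(
        daily_load.get(match_day - timedelta(days=k), 0.0)
        for k in range(1, window_days + 1)
    )


def workload_features(daily_load, match_day, windows=(7, 14, 21)):
    """Build one feature per look-back window, e.g. 'load_7d'."""
    return {
        f"load_{w}d": cumulative_workload(daily_load, match_day, w)
        for w in windows
    }


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted KPI values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up data: a constant 5000 m of total distance on each prior day.
    match_day = date(2024, 3, 30)
    daily = {match_day - timedelta(days=k): 5000.0 for k in range(1, 29)}
    print(workload_features(daily, match_day))
```

In a fuller pipeline, one such feature set would be computed per player, per match, and per GPS metric before dimensionality reduction; the three metric functions would then compare held-out KPI values with model predictions.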
