Rejoinder to Next Generation Models for Subsequent Sports Injuries by Wu et al.

IF 1.5 · Zone 4 (Mathematics) · JCR Q3 · Mathematics, Interdisciplinary Applications

Applied Stochastic Models in Business and Industry · Pub Date: 2025-07-30 · DOI: 10.1002/asmb.70035

Paul Pao-Yen Wu, Yu Yi Yu, Liam A. Toohey, Michael Drew, Scott A. Sisson, Clara Grazian, Kerrie Mengersen

Volume 41, Issue 4 · Journal Article · Full text: https://onlinelibrary.wiley.com/doi/10.1002/asmb.70035

Citations: 0

Abstract

We greatly appreciate the commentary and positive feedback of the discussants, Prof. Jialiang Li and Dr. Rhythm Grover, which enrich our paper and its context.

As noted by Prof. Li, survival models are highly applicable to the subsequent sports injury problem given the temporal dimension of injury data. In the sporting context, censoring can arise, for example, from finite surveillance windows associated with a sporting season, athletes joining and leaving a team, or even extended absence due to injury [1, 2]. However, because individual athletes are complex systems whose dynamics and susceptibility to injury may change over time, it is also important to capture the changing state of the athlete explicitly [3]. For example, increasing strength with training over a season could reduce injury risk, whereas a serious injury such as an anterior cruciate ligament (ACL) injury could increase susceptibility to subsequent injuries.
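To make the censoring mechanisms above concrete, the following minimal sketch converts a hypothetical athlete record into a right-censored (duration, event) pair. The `AthleteRecord` fields, the day-based time scale, and the `season_end` parameter are illustrative assumptions, not the data format used in the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AthleteRecord:
    """Hypothetical follow-up record; all times are days from season start."""
    joined_day: int             # day the athlete entered surveillance
    left_day: Optional[int]     # day the athlete left the team, if any
    injury_day: Optional[int]   # day of the subsequent injury, if any


def to_survival_pair(rec: AthleteRecord, season_end: int) -> Tuple[int, bool]:
    """Return (duration, event_observed) under right-censoring.

    Follow-up stops at the earliest of injury, departure, or season end.
    The event flag is True only when the injury is what ended follow-up.
    """
    stop = season_end
    if rec.left_day is not None:
        stop = min(stop, rec.left_day)
    if rec.injury_day is not None and rec.joined_day <= rec.injury_day <= stop:
        return rec.injury_day - rec.joined_day, True
    # No injury seen inside the surveillance window: censored at `stop`.
    return stop - rec.joined_day, False
```

An athlete who leaves the team on day 100 and is injured elsewhere on day 120 is recorded as censored at day 100, which is the information a survival model can use without treating the athlete as injury-free.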

As noted by Dr. Grover, our paper presented a pragmatic approach to the challenges of modeling subsequent injury: reducing dimensionality through a time-varying Cox Proportional Hazards (PH) model, and using a discrete-time Hidden Markov Model (HMM) to capture changes in susceptibility and covariate effects over time. Both Prof. Li and Dr. Grover note the potential computational challenge associated with HMMs, especially in the presence of large-scale and high-dimensional datasets. This motivates dimension reduction, which was undertaken using survival modeling to cater explicitly for the time-to-event nature of injury data and for censoring. The appropriateness of the survival model was supported by checks of the PH model assumptions (e.g., proportional hazards, Schoenfeld residuals) and by validation results (concordance index), as reported in our paper.
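As an illustration of the concordance index mentioned as a validation measure, the sketch below computes Harrell's C for right-censored data in plain Python. It is a simplified version written for this note (pairs with tied durations are skipped, and tied risk scores count as half-concordant); the function and argument names are assumptions and it is not the implementation used in the paper.

```python
from typing import Sequence


def concordance_index(durations: Sequence[float],
                      events: Sequence[bool],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means the predicted risks rank every comparable pair correctly, 0.5 corresponds to uninformative scores, and 0.0 to a fully reversed ranking.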

Alongside computational complexity is the related challenge of model convergence. Greater model complexity, such as more HMM states or more model covariates, can lead to challenges with model identifiability, estimation, computation, and thus model convergence [4]. This is a current research challenge when data are limited, as in our subsequent injury application, which comprises 33 players and 2523 training and competition sessions over one season. Computationally, the proposed discrete-time HMM fitted with Expectation Maximization (EM) took approximately 155 s to converge for the entire team of players over one season, compared to less than a second for the Cox PH model. However, model convergence with more than two states could not be achieved with this limited dataset. Therefore, although the computational cost is manageable in this case study, the available data can limit the level of model complexity that can be achieved. This highlights the utility of the proposed combination of dimension reduction and state space modeling as a more generalizable approach, and the need for more research in this area.
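The link between the number of states and estimation difficulty can be sketched with a generic discrete-emission HMM. The code below is an illustrative assumption rather than the paper's model (which includes covariates): `hmm_log_likelihood` is the scaled forward pass that yields the log-likelihood an EM fit would monitor for convergence, and `n_free_parameters` counts the free parameters that must be estimated for a given number of states.

```python
import math
from typing import Sequence


def hmm_log_likelihood(obs: Sequence[int],
                       start: Sequence[float],
                       trans: Sequence[Sequence[float]],
                       emit: Sequence[Sequence[float]]) -> float:
    """Log-likelihood of a discrete observation sequence (scaled forward pass).

    start[k]    : P(state_0 = k)
    trans[j][k] : P(state_t = k | state_{t-1} = j)
    emit[k][o]  : P(obs_t = o | state_t = k)
    """
    if len(obs) == 0:
        raise ValueError("obs must be non-empty")
    n_states = len(start)
    alpha = [start[k] * emit[k][obs[0]] for k in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then weight
            # by the emission probability of the current observation.
            alpha = [
                sum(alpha[j] * trans[j][k] for j in range(n_states))
                * emit[k][obs[t]]
                for k in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def n_free_parameters(n_states: int, n_symbols: int) -> int:
    """Free parameters of a discrete-emission HMM without covariates."""
    initial = n_states - 1
    transitions = n_states * (n_states - 1)
    emissions = n_states * (n_symbols - 1)
    return initial + transitions + emissions
```

With binary observations, moving from two to three states raises the count from 5 to 11 free parameters before any covariates are added, which gives a rough sense of why a dataset of this size may not support more than two states.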

Along these lines, both Prof. Li and Dr. Grover discuss the challenge of computationally efficient inference and future research in this area, including bootstrapping, additive hazards models, and model-free dimension reduction with censored data. In addition to, or in combination with, bootstrapping, Bayesian inference of HMMs [5, 6] could be another avenue for investigation, both for inference with limited data and to help overcome potential challenges of local maxima with EM. Furthermore, a combined approach to variable selection and inference could potentially capture variables that are influential in an HMM context but not marginally impactful in a survival model, as noted by Dr. Grover. This is challenging due to the computational and inferential complexity noted above; however, methods for learning Dynamic Bayesian Networks, which are a generalized form of HMMs, from high-dimensional data could potentially be adapted to subsequent injury [7, 8]. Another avenue for investigation could be the Continuous Time HMM (CTHMM) as a way to potentially better capture non-uniform time intervals between observations. However, the CTHMM incurs additional model complexity compared to the discrete-time HMM, as both state transition times and the number of state transitions between observations need to be estimated [4].
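How a continuous-time formulation accommodates non-uniform intervals can be sketched for the two-state case, where the transition probabilities over an interval of length dt have a closed form. The function below is a minimal illustration under the assumption of a time-homogeneous two-state chain; the rate names are hypothetical and no covariates are included.

```python
import math
from typing import List


def two_state_transition_matrix(rate_01: float,
                                rate_10: float,
                                dt: float) -> List[List[float]]:
    """P(dt) = exp(Q * dt) for Q = [[-rate_01, rate_01], [rate_10, -rate_10]].

    Entry [j][k] is the probability of being in state k after an interval
    dt given state j at its start, summed over every possible number and
    timing of transitions in between.
    """
    total = rate_01 + rate_10
    if total <= 0 or dt < 0:
        raise ValueError("rates must sum to a positive value and dt must be >= 0")
    decay = math.exp(-total * dt)
    p01 = rate_01 / total * (1.0 - decay)
    p10 = rate_10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]
```

Each pair of consecutive observations gets its own matrix evaluated at the actual gap between sessions, whereas a discrete-time HMM applies one fixed matrix per step. The unobserved transitions inside each gap are what the closed form marginalizes over, and they are the source of the additional estimation burden noted above.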

In this study, the data were limited to one Australian Football League (AFL) club over the course of one season. With additional data over more seasons and/or clubs, we could better assess model transferability and generalizability, as noted by Dr. Grover. The proposed HMM, however, was able to assess the injury risk of individual players with differing playing positions, exposures, and loads, suggesting some level of generalizability. Larger datasets could also enable the application of modern machine learning methods, including recurrent neural networks and temporal convolutional networks, which to date have been understudied in the subsequent injury domain. Generally, neural networks require large datasets and are challenging to interpret, but can produce very high predictive performance [9]. One potential approach for future research to help address the practical challenge of limited data in elite sports could be to train a model on a larger injury dataset and re-train it for a specific sports club [10]. This is important because, with the competitive nature of sport and the movement of players between clubs, obtaining complete data for individual athletes over time is challenging.

The authors declare no conflicts of interest.


Source journal

Applied Stochastic Models in Business and Industry (Mathematics, Interdisciplinary Applications)

CiteScore: 2.70

Self-citation rate: 0.00%

Review time: >12 weeks

Journal description: ASMBI - Applied Stochastic Models in Business and Industry (formerly Applied Stochastic Models and Data Analysis) was first published in 1985, publishing contributions at the interface between stochastic modelling, data analysis and their applications in business, finance, insurance, management and production. In 2007 ASMBI became the official journal of the International Society for Business and Industrial Statistics (www.isbis.org). The main objective is to publish papers, both technical and practical, presenting new results which solve real-life problems or have great potential in doing so. Mathematical rigour, innovative stochastic modelling and sound applications are the key ingredients of papers to be published, after a very selective review process. The journal is very open to new ideas, like Data Science and Big Data stemming from problems in business and industry or uncertainty quantification in engineering, as well as more traditional ones, like reliability, quality control, design of experiments, managerial processes, supply chains and inventories, insurance, econometrics, and financial modelling (provided the papers are related to real problems). The journal is also interested in papers addressing the effects of business and industrial decisions on the environment, healthcare, and social life. State-of-the-art computational methods are very welcome as well, when combined with sound applications and innovative models.