Arezoo Abasi, Ahmad Nazari, Azar Moezy, Seyed Ali Fatemi Aghda
BioData Mining 18(1):16, published 2025-02-17. DOI: https://doi.org/10.1186/s13040-025-00431-2. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11834553/pdf/
Machine learning models for reinjury risk prediction using cardiopulmonary exercise testing (CPET) data: optimizing athlete recovery.
Background: Cardiopulmonary Exercise Testing (CPET) provides detailed insights into athletes' cardiovascular and pulmonary function, making it a valuable tool in assessing recovery and injury risks. However, traditional statistical models often fail to leverage the full potential of CPET data in predicting reinjury. Machine learning (ML) algorithms offer promising capabilities in uncovering complex patterns within this data, allowing for more accurate injury risk assessment.
Objective: This study aimed to develop machine learning models to predict reinjury risk among elite soccer players using CPET data. Specifically, we sought to identify key physiological and performance variables that correlate with reinjury and to evaluate the performance of various ML algorithms in generating accurate predictions.
Methods: A dataset of 256 elite soccer players from 16 national and top-tier teams in Iran was analyzed, incorporating physiological variables and categorical data. Several machine learning models, including CatBoost, SVM, Random Forest, and XGBoost, were trained to predict reinjury risk. Model performance was assessed using accuracy, precision, recall, F1-score, and AUC, and SHAP values were used to support interpretability.
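The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and only the Python standard library is used. Label 1 stands for "reinjury" and 0 for "no reinjury".

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = reinjury)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone. That is why one model can lead on accuracy and F1 while another leads on AUC.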
Results: CatBoost and SVM exhibited the best performance, with CatBoost achieving the highest accuracy (0.9138) and F1-score (0.9148), and SVM achieving the highest AUC (0.9725). A significant association was found between a history of concussion and reinjury risk (χ² = 13.0360, p = 0.0015), highlighting the importance of neurological recovery in preventing future injuries. Heart rate metrics, particularly HRmax and HR2, were also significantly lower in players who experienced reinjury, indicating reduced cardiovascular capacity in this group.
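The chi-square result above can be illustrated with a short sketch of a test of independence. The abstract does not state the degrees of freedom or the contingency table, so two things here are assumptions: the counts in `hypothetical_table` are invented purely for illustration, and the p-value helper covers only the case of 2 degrees of freedom, where the chi-square survival function has the closed form exp(-x/2). Under that assumption the reported statistic of 13.0360 gives a p-value of about 0.0015, which is consistent with the reported value but does not confirm how the authors computed it.

```python
import math
from typing import List


def chi_square_statistic(table: List[List[float]]) -> float:
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p_value_df2(stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with exactly 2 degrees
    of freedom, where the survival function is exp(-x / 2)."""
    return math.exp(-stat / 2.0)


# Statistic reported in the abstract; df = 2 is an assumption.
reported_p = chi_square_p_value_df2(13.0360)

# Hypothetical 2 x 3 table (rows: reinjured / not reinjured; columns: three
# concussion-history categories). These counts are NOT study data.
hypothetical_table = [[10, 20, 30],
                      [30, 20, 10]]
hypothetical_stat = chi_square_statistic(hypothetical_table)
hypothetical_p = chi_square_p_value_df2(hypothetical_stat)  # (2-1)*(3-1) = 2 df
```

For tables with other dimensions the degrees of freedom are (rows - 1) x (columns - 1), and the p-value would need a general chi-square survival function rather than this closed form.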
Conclusion: Machine learning models, particularly CatBoost and SVM, provide promising tools for predicting reinjury risk using CPET data. These models offer clinicians more precise, data-driven insights into athlete recovery and risk management. Future research should explore the integration of external factors such as training load and psychological readiness to further refine these predictions and enhance injury prevention protocols.
About the journal:
BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data.
Topical areas include, but are not limited to:
-Development, evaluation, and application of novel data mining and machine learning algorithms.
-Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
-Open-source software for the application of data mining and machine learning algorithms.
-Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
-Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.