Predicting Sprint Potential: A Machine Learning Model Based on Blood Metabolite Profiles in Young Male Athletes
Jingfeng Chen, Yuhang Qian, Yuansheng Xu
European Journal of Sport Science, 25(3). Published 2025-02-24. Journal Article.
DOI: 10.1002/ejsc.12272
https://onlinelibrary.wiley.com/doi/10.1002/ejsc.12272
Citations: 0
Abstract
This study uses blood metabolite signatures in males to (i) distinguish healthy individuals from athletes, with the goal of improving athlete screening, and (ii) predict performance in the 100, 200, and 400 m sprints, with the goal of informing precompetition preparation and intervention strategies. We first used nontargeted metabolomics to profile the blood metabolome of healthy individuals (n = 10) and athletes (n = 10), and identified differentially expressed metabolites (DEMs) potentially related to athletic performance through differential analysis, consensus clustering, WGCNA, and UMAP analysis. Using LASSO-Cox analysis, we then narrowed the selection to two core DEMs associated with athletic performance: HMDB0012085 (Sphingomyelin (d18:0/14:0)) and HMDB0009224 (Phosphatidylethanolamine(20:0/18:1(9Z))). Next, we applied targeted metabolomics to measure these two DEMs in a larger cohort of healthy individuals (n = 50) and athletes (n = 100); levels of both HMDB0012085 and HMDB0009224 were significantly higher in athletes than in healthy individuals. Across 13 machine learning classification methods, blood levels of HMDB0012085 and HMDB0009224 effectively differentiated healthy individuals from athletes. HMDB0012085 had greater feature importance than HMDB0009224 in multiple algorithms: decision trees (94.1 vs. 5.9), random forests (60.7 vs. 39.3), gradient boosting trees (91.5 vs. 8.5), CatBoost (61.7 vs. 38.3), ExtraTrees (64.7 vs. 35.3), and XGBoost (74.5 vs. 25.5). Finally, whole-blood levels of HMDB0012085 and HMDB0009224 were significantly negatively correlated with sprint times in the 100, 200, and 400 m races; that is, higher levels were associated with faster times. In conclusion, HMDB0012085 and HMDB0009224 in whole blood hold promise as biomarkers for predicting athletic potential in males.
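To make the two quantitative ideas in the abstract concrete (relative feature importance of two metabolites, and a negative correlation between metabolite level and sprint time), the following is a minimal sketch on synthetic data. It is not the authors' pipeline and does not use their measurements. Everything in it is an assumption made for illustration: the column names `sm` and `pe`, the simulated group means, the simulated 100 m times, and the use of a single-threshold Gini impurity decrease as a simple stand-in for the tree-based importances reported above. The correlation is a Spearman rank correlation; the abstract does not say which correlation measure the study used.

```python
import random
import statistics

random.seed(0)  # fixed seed so the synthetic data is reproducible


def make_cohort(n_controls=50, n_athletes=100):
    """Synthetic stand-in cohort (NOT study data): label 1 = athlete, 0 = control.

    "sm" and "pe" are hypothetical names for the two metabolite levels.
    """
    rows = []
    for label, n in ((0, n_controls), (1, n_athletes)):
        for _ in range(n):
            sm = random.gauss(2.0 if label else 0.0, 0.5)  # strongly shifted in athletes
            pe = random.gauss(0.5 if label else 0.0, 1.0)  # weakly shifted in athletes
            rows.append({"sm": sm, "pe": pe, "label": label})
    return rows


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(rows, feature):
    """Largest Gini impurity decrease achievable with one threshold on `feature`."""
    labels = [r["label"] for r in sorted(rows, key=lambda r: r[feature])]
    parent, n, best = gini(labels), len(labels), 0.0
    for i in range(1, n):
        left, right = labels[:i], labels[i:]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def relative_importance(rows, features):
    """Normalize per-feature gains so they sum to 100, like the paired values above."""
    gains = {f: best_split_gain(rows, f) for f in features}
    total = sum(gains.values())
    if total == 0:
        return {f: 100.0 / len(features) for f in features}
    return {f: 100.0 * g / total for f, g in gains.items()}


def ranks(values):
    """Rank positions (1 = smallest); assumes no ties, which holds for these floats."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


cohort = make_cohort()
importance = relative_importance(cohort, ["sm", "pe"])

athletes = [r for r in cohort if r["label"] == 1]
# Hypothetical 100 m times in seconds: faster with higher "sm", plus noise.
times_100m = [11.5 - 0.4 * r["sm"] + random.gauss(0.0, 0.1) for r in athletes]
rho = spearman([r["sm"] for r in athletes], times_100m)

print(importance)     # two percentages that sum to 100
print(round(rho, 3))  # negative by construction of the synthetic times
```

Real tree ensembles such as random forests or XGBoost aggregate impurity decreases (or gain) over many splits and many trees, so their importance values will differ from this single-split version. The sketch only shows why such importances can be read as shares of a total.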