{"title":"Machine learning-driven market value prediction for European football players","authors":"Abdullah Tamim , Md. Wadud Jahan , Md. Rashid Shahriar Chowdhury , Ahammad Hossain , Md. Mizanur Rahman , A.H.M. Rahmatullah Imon","doi":"10.1016/j.jcmds.2025.100118","DOIUrl":null,"url":null,"abstract":"<div><div>Football is globally recognized as the most widely practiced and watched sport. Precise player value is crucial for clubs seeking to maximize their player acquisition strategy and overall success in football. Conventional player valuation methodologies are mainly dependent on expert judgments and subjective assessments, missing the objectivity and precision provided by data-driven approaches. This study seeks to close this disparity by utilizing machine learning techniques to predict the market valuations of football players. The analysis is conducted using an extensive dataset sourced from the FIFA 22 video game, which was obtained via sofifa.com. The collection includes more than 16,000 players. The Machine Learning (ML) techniques used in this study are Multiple Linear Regression (MLR), Ridge Regression (RR), Support Vector Regression (SVR), and Random Forest Regression (RFR). The machine learning algorithms undergo training using 80% of the samples and are subsequently tested using the remaining 20% of the samples. We evaluate each algorithm’s performance using metrics like Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R<sup>2</sup>) value. Numerical results show that the RFR model demonstrates superior performance by achieving the lowest MAE, MSE, RMSE, and the highest R<sup>2</sup> value across all samples. The RFR effectively captures non-linear interactions and reliably prevents overfitting. This research finding will enhance the existing knowledge in sports economics by demonstrating how ML can be used to anticipate the market prices of football players with better accuracy. 
This will provide football teams with valuable insights to make more strategic decisions.</div></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"15 ","pages":"Article 100118"},"PeriodicalIF":0.0000,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computational Mathematics and Data Science","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772415825000100","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Football is widely regarded as the most played and most watched sport in the world. Accurate player valuation is crucial for clubs seeking to optimize their player acquisition strategy and overall success. Conventional valuation methods depend mainly on expert judgment and subjective assessment, and so lack the objectivity and precision of data-driven approaches. This study seeks to close that gap by using machine learning to predict the market values of football players. The analysis uses a dataset of more than 16,000 players from the FIFA 22 video game, obtained via sofifa.com. The machine learning (ML) techniques used are Multiple Linear Regression (MLR), Ridge Regression (RR), Support Vector Regression (SVR), and Random Forest Regression (RFR). Each model is trained on 80% of the samples and tested on the remaining 20%. We evaluate each model using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). The numerical results show that the RFR model performs best, achieving the lowest MAE, MSE, and RMSE and the highest R² across all samples. The RFR captures non-linear interactions effectively and resists overfitting. These findings add to the sports economics literature by demonstrating that ML can predict football players' market values more accurately, giving clubs valuable insight for more strategic decisions.
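The evaluation protocol described in the abstract (an 80/20 train/test split scored with MAE, MSE, RMSE, and R²) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names `train_test_split` and `regression_metrics`, the fixed random seed, and the shuffling strategy are assumptions made here for illustration, and the abstract does not say how the split was drawn.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper: the seed and the shuffling are illustrative choices.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired true/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n       # mean absolute error
    mse = sum(e * e for e in errors) / n        # mean squared error
    rmse = math.sqrt(mse)                       # same units as the target
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    ss_res = sum(e * e for e in errors)                 # residual variation
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}
```

Under this scheme, each of the four models would be fitted on the training portion only, and `regression_metrics` would be called on its predictions for the held-out 20%. The model with the lowest MAE, MSE, and RMSE and the highest R² is then preferred, which is the comparison the abstract reports for RFR.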