Enhancing game outcome prediction in the Chinese basketball league through a machine learning framework based on performance data.

IF 3.9 | Zone 2, Multidisciplinary Journals | JCR Q1, MULTIDISCIPLINARY SCIENCES
Yuhua Zhong
Scientific Reports, vol. 15, no. 1, 23788. Published 2025-07-03. Journal Article.
DOI: 10.1038/s41598-025-08882-7 (https://doi.org/10.1038/s41598-025-08882-7)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12229449/pdf/
Citations: 0

Abstract

Enhancing game outcome prediction in the Chinese basketball league through a machine learning framework based on performance data.

Basketball is among the most popular sports in the world, and its competitions draw substantial attention. The analysis and modeling of basketball game data have long been central topics in sports analytics, and in recent years machine learning techniques have brought significant advances in predicting game outcomes. However, most existing studies focus on NBA data, with relatively limited exploration of other leagues. To address this gap, this study uses game data from the Chinese Basketball Association spanning the 2021-2024 seasons to develop predictive models. It is the first to apply the classical Four Factors model and Defense-Offense model, along with their derivative versions (the Four Factors detailed model and the Defense-Offense detailed model), to the Chinese Men's Professional Basketball League, providing a baseline for prediction. To keep the models usable in real-world scenarios, the study uses only data available before the start of each game as feature variables for training. This ensures that the enhanced models perform well in theoretical evaluations and also provide reliable predictions when applied in practice. Model performance is evaluated with a diverse set of machine learning algorithms, including support vector machines, Naive Bayes, k-nearest neighbors, logistic regression, a multi-layer perceptron with contrastive loss, and XGBoost, with metrics such as Accuracy, F1 Score, Recall, Precision, and AUROC used for comparison. The results show that incorporating additional features substantially improves predictive performance. In particular, under logistic regression, the newly developed model based on the Four Factors detailed model achieves an accuracy of 85.49%, the highest among all evaluated approaches.
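
The abstract does not spell out its feature formulas, so the following is only a minimal sketch of how pre-game Four Factors features might be built. It assumes the commonly cited definitions of Dean Oliver's Four Factors (with free-throw rate taken as FTM/FGA, one of several variants in use), a hypothetical `BoxScore` schema, a hypothetical five-game averaging window, and home-minus-away differences as the final feature vector. None of these choices is confirmed by the source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BoxScore:
    """One team's box score for one finished game (hypothetical schema)."""
    fgm: int      # field goals made
    fga: int      # field goals attempted
    fg3m: int     # three-pointers made
    fta: int      # free throws attempted
    ftm: int      # free throws made
    tov: int      # turnovers
    orb: int      # offensive rebounds
    opp_drb: int  # opponent's defensive rebounds


def four_factors(b: BoxScore) -> Dict[str, float]:
    """Four Factors in their commonly cited form (assumed, not from the paper)."""
    return {
        "efg": (b.fgm + 0.5 * b.fg3m) / b.fga,              # effective FG%
        "tov_pct": b.tov / (b.fga + 0.44 * b.fta + b.tov),  # turnover rate
        "orb_pct": b.orb / (b.orb + b.opp_drb),             # offensive rebound rate
        "ft_rate": b.ftm / b.fga,                           # free-throw rate
    }


def pregame_features(history: List[BoxScore], window: int = 5) -> Dict[str, float]:
    """Average the Four Factors over a team's last `window` completed games.

    `history` must contain only games that finished before tip-off, so
    nothing from the game being predicted leaks into its features.
    """
    recent = history[-window:]
    if not recent:
        raise ValueError("need at least one completed game")
    rows = [four_factors(b) for b in recent]
    return {k: sum(r[k] for r in rows) / len(rows) for k in rows[0]}


def matchup_vector(home: List[BoxScore], away: List[BoxScore]) -> List[float]:
    """Home-minus-away differences, one value per factor, in a fixed order."""
    h, a = pregame_features(home), pregame_features(away)
    return [h[k] - a[k] for k in ("efg", "tov_pct", "orb_pct", "ft_rate")]
```

A vector built this way could be passed to any of the classifiers named in the abstract. Restricting `history` to completed games is what keeps the features consistent with the study's pre-game-only constraint.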

Source journal: Scientific Reports (Natural Science Disciplines)
CiteScore: 7.50
Self-citation rate: 4.30%
Articles published: 19567
Review time: 3.9 months

About the journal: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.
• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms from microorganisms, animals to plants.
• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.