The Development prediction model of Korea Professional Baseball league spectator using machine learning
Authors: Jung-Hwan Cho, Boo-Gil Seok
Journal: The Korean Society of Sports Science
Published: 2023-10-31
DOI: https://doi.org/10.35159/kjss.2023.10.32.5.547
Citations: 0
Abstract
[Purpose] This study uses machine learning to identify the main factors that predict spectator attendance in Korean professional baseball. [Methods] Daily attendance figures for professional baseball games from 2017 to 2019 were collected. External factors, such as the weather and holidays on the day of the match, and internal match factors, such as the away team, were used as input variables. The data were analyzed with Python 3.6, and predictive power was cross-validated for three machine learning models: Lasso regression, random forest, and XGBoost. [Results] XGBoost showed the highest predictive power, with 58.4% accuracy when predicting attendance for the entire KBO league. For the league as a whole, the ‘Date’ factor was used most frequently, and among individual factors, holidays were used most frequently in prediction. For predicting total attendance by team, the ‘Away team’ and ‘Date’ factors were used most frequently. [Conclusions] These results suggest that teams and the league can propose various marketing strategies if attendance is predicted in light of game performance, the opposing team, and the weather.
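The abstract does not include the evaluation code, so the following is only a minimal sketch of the cross-validation idea it mentions, not the authors' pipeline. Several things here are assumptions: the data are synthetic, the field names (`holiday`, `away`, `rain`) are hypothetical, a simple group-mean predictor is swapped in for Lasso, random forest, and XGBoost to keep the example free of dependencies, and R² is used as the score because the abstract does not define how its "accuracy" figure was computed.

```python
import random
from statistics import mean


def make_games(n=600, seed=0):
    """Generate synthetic game records (hypothetical data, not from the study)."""
    rng = random.Random(seed)
    teams = ["A", "B", "C", "D"]
    away_effect = {"A": 3000, "B": 1000, "C": 0, "D": -1000}
    games = []
    for _ in range(n):
        holiday = rng.random() < 0.3
        rain = rng.random() < 0.2
        away = rng.choice(teams)
        attendance = (10000 + (4000 if holiday else 0) + away_effect[away]
                      - (2500 if rain else 0) + rng.gauss(0, 1500))
        games.append({"holiday": holiday, "away": away, "rain": rain,
                      "attendance": attendance})
    return games


def fit_group_means(train, features):
    """Stand-in model: mean attendance for each combination of feature values."""
    groups = {}
    for game in train:
        key = tuple(game[f] for f in features)
        groups.setdefault(key, []).append(game["attendance"])
    overall = mean(game["attendance"] for game in train)
    return {key: mean(values) for key, values in groups.items()}, overall


def predict(model, game, features):
    """Look up the group mean; fall back to the overall mean for unseen groups."""
    table, overall = model
    return table.get(tuple(game[f] for f in features), overall)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    centre = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - centre) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def cross_val_r2(games, features, k=5, seed=0):
    """Average R^2 over k folds, each fold held out once for testing."""
    idx = list(range(len(games)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [games[i] for i in idx if i not in held_out]
        test = [games[i] for i in fold]
        model = fit_group_means(train, features)
        preds = [predict(model, game, features) for game in test]
        scores.append(r2([game["attendance"] for game in test], preds))
    return mean(scores)


games = make_games()
for feats in (["holiday"], ["away"], ["holiday", "away", "rain"]):
    print(feats, round(cross_val_r2(games, feats), 3))
```

Comparing the cross-validated score across feature sets gives a crude view of which factors carry predictive power. The study instead reports how frequently each factor was used by the fitted models, which is a different measure.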