Moving from Machine Learning to Statistics: the case of Expected Points in American football

arXiv - STAT - Applications, Pub Date: 2024-09-07, DOI: arxiv-2409.04889

Ryan S. Brill, Ryan Yee, Sameer K. Deshpande, Abraham J. Wyner


Citations: 0

Abstract

Expected points is a value function fundamental to player evaluation and strategic in-game decision-making across sports analytics, particularly in American football. To estimate expected points, football analysts use machine learning tools, which are not equipped to handle certain challenges. They suffer from selection bias, display counter-intuitive artifacts of overfitting, do not quantify uncertainty in point estimates, and do not account for the strong dependence structure of observational football data. These issues are not unique to American football or even sports analytics; they are general problems analysts encounter across various statistical applications, particularly when using machine learning in lieu of traditional statistical models. We explore these issues in detail and devise expected points models that account for them. We also introduce a widely applicable novel methodological approach to mitigate overfitting, using a catalytic prior to smooth our machine learning models.
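The catalytic-prior idea named in the abstract can be illustrated with a toy sketch. What follows is not the authors' model: it assumes the general recipe of augmenting the observed data with down-weighted synthetic data drawn from a simpler model, then refitting the flexible model on the combined set. The data, the per-yardline "flexible" fit, the linear "simple" model, and the names `binned_means`, `roughness`, `tau`, and `M` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (simulated, not real football data): yards from the end zone
# and the value of the next score, three observations per yardline.
grid = np.arange(1, 100)
yardline = np.tile(grid, 3)
next_score = 6.0 - 0.07 * yardline + rng.normal(0.0, 3.0, size=yardline.size)


def binned_means(x, y, w, grid):
    """Weighted mean of y at each yardline: a deliberately flexible fit."""
    size = grid.max() + 1
    weighted_sum = np.bincount(x, weights=w * y, minlength=size)
    total_weight = np.bincount(x, weights=w, minlength=size)
    return weighted_sum[grid] / total_weight[grid]


def roughness(curve):
    """Sum of squared second differences; larger means a wigglier curve."""
    return float(np.sum(np.diff(curve, n=2) ** 2))


# Flexible model fit to the observed data only (prone to overfitting).
ep_raw = binned_means(yardline, next_score, np.ones(yardline.size), grid)

# Simple model: a straight line in yardline, fit by least squares.
slope, intercept = np.polyfit(yardline, next_score, 1)
resid_sd = np.std(next_score - (intercept + slope * yardline), ddof=2)

# Synthetic data drawn from the simple model's predictive distribution.
M = 20 * grid.size                      # number of synthetic observations
x_syn = np.repeat(grid, 20)
y_syn = intercept + slope * x_syn + rng.normal(0.0, resid_sd, size=M)

# Each synthetic point gets weight tau / M, so the synthetic set carries
# total weight tau; tau controls how strongly the fit is pulled toward
# the simple model.
tau = 600.0
x_aug = np.concatenate([yardline, x_syn])
y_aug = np.concatenate([next_score, y_syn])
w_aug = np.concatenate([np.ones(yardline.size), np.full(M, tau / M)])

# Flexible model refit on observed plus down-weighted synthetic data.
ep_catalytic = binned_means(x_aug, y_aug, w_aug, grid)

print("roughness, observed data only:", roughness(ep_raw))
print("roughness, with synthetic data:", roughness(ep_catalytic))
```

In this sketch, setting `tau` to zero recovers the raw flexible fit, while a very large `tau` pushes the curve toward the straight line, so `tau` plays the role of a smoothing parameter that would need to be tuned in practice.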
