Out-of-sample $R^2$: estimation and inference

The American Statistician. Pub Date: 2023-02-10. DOI: 10.1080/00031305.2023.2216252

Stijn Hawinkel, W. Waegeman, Steven Maere


Abstract

Out-of-sample prediction is the acid test of predictive models, yet an independent test dataset is often not available for assessment of the prediction error. For this reason, out-of-sample performance is commonly estimated using data splitting algorithms such as cross-validation or the bootstrap. For quantitative outcomes, the ratio of variance explained to total variance can be summarized by the coefficient of determination or in-sample $R^2$, which is easy to interpret and to compare across different outcome variables. As opposed to the in-sample $R^2$, the out-of-sample $R^2$ has not been well defined, and the variability of the out-of-sample $\hat{R}^2$ has been largely ignored. Usually only its point estimate is reported, hampering formal comparison of the predictability of different outcome variables. Here we explicitly define the out-of-sample $R^2$ as a comparison of two predictive models, provide an unbiased estimator, and exploit recent theoretical advances on the uncertainty of data splitting estimates to provide a standard error for the $\hat{R}^2$. The performance of the estimators for the $R^2$ and its standard error is investigated in a simulation study. We demonstrate our new method by constructing confidence intervals and comparing models for prediction of quantitative $\text{Brassica napus}$ and $\text{Zea mays}$ phenotypes based on gene expression data.
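To make the idea of an out-of-sample $R^2$ as "a comparison of two predictive models" concrete, the sketch below computes a generic cross-validated version: one minus the ratio of the held-out squared error of a fitted model to that of a baseline. The abstract gives no formulas, so everything here is an assumption for illustration: the mean-only baseline, the least-squares linear model, the plain ratio of fold-averaged errors, and the hypothetical function name `cv_out_of_sample_r2`. It is not the unbiased estimator or the standard error proposed in the paper.

```python
import numpy as np


def cv_out_of_sample_r2(X, y, n_folds=10, seed=0):
    """K-fold CV estimate of 1 - MSE(model) / MSE(mean-only baseline).

    Illustrative only: both models are refit on each training fold and
    scored on the held-out fold, so no observation is predicted by a
    model that saw it during fitting.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    sq_err_model = np.empty(n)
    sq_err_null = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        X_tr, y_tr = X[train_mask], y[train_mask]
        # Model 1 (assumed): linear regression with intercept, least squares.
        A_tr = np.column_stack([np.ones(len(y_tr)), X_tr])
        coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
        A_te = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        sq_err_model[test_idx] = (y[test_idx] - A_te @ coef) ** 2
        # Model 2 (assumed baseline): predict the training-fold mean.
        sq_err_null[test_idx] = (y[test_idx] - y_tr.mean()) ** 2
    return 1.0 - sq_err_model.mean() / sq_err_null.mean()


if __name__ == "__main__":
    # Simulated data, purely for demonstration.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=200)
    print(cv_out_of_sample_r2(X, y))
```

Unlike the in-sample $R^2$, this quantity can be negative: when the fitted model predicts held-out data worse than the baseline does, the error ratio exceeds one. A single number from this function is only a point estimate, which is exactly the limitation the abstract raises.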
