The practical utility of incorporating model selection uncertainty into prognostic models for survival data

Statistical Modeling, Pub Date: 2005-07-01, DOI: 10.1191/1471082X05st089oa

N. Augustin, W. Sauerbrei, M. Schumacher


Citations: 44

Abstract



Predictions of disease outcome in prognostic factor models are usually based on one selected model. However, several models often fit the data equally well, yet they may differ substantially in the explanatory variables they include and may lead to different predictions for individual patients. For survival data, we discuss two approaches that account for model selection uncertainty and apply them to two data examples, with the main emphasis on variable selection in a Cox proportional hazards model. The main aim of our investigation is to establish the ways in which either of the two approaches is useful in such prognostic models. The first approach is Bayesian model averaging (BMA) adapted for the proportional hazards model, termed ‘approx. BMA’ here. As a new approach, we propose a method that averages over a set of possible models using weights estimated from bootstrap resampling, as proposed by Buckland et al.; in addition, we perform an initial screening of variables based on the inclusion frequency of each variable to reduce the set of variables and corresponding models. For some necessary parameters of the procedure, investigations concerning sensible choices are still required. The main objective of prognostic models is prediction, but the interpretation of single effects is also important, and models should be general enough to ensure transportability to other clinical centres. In the data examples, we compare predictions of our new approach with those of approx. BMA, with ‘conventional’ predictions from one selected model, and with predictions from the full model. Confidence intervals are compared in one example. Comparisons are based on the partial predictive score and the Brier score. We conclude that the two model averaging methods yield similar results and are especially useful when there is a large number of potential prognostic factors, some of which most likely have no influence in a multivariable context. Although the method based on bootstrap resampling lacks formal justification and requires some ad hoc decisions, it has the additional positive effect of achieving model parsimony by reducing the number of explanatory variables and dealing with correlated variables in an automatic fashion.
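The bootstrap-based averaging described in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' implementation: the function names, the threshold of 0.3, the number of resamples, and the toy correlation-based selector are all assumptions made for illustration. In practice the `select` argument would be a variable selection procedure for a Cox model (for example, backward elimination), and `predict` would return a predicted survival probability from the model refitted with the given variables.

```python
import random
from collections import Counter


def bootstrap_models(rows, candidates, select, n_boot, rng):
    """Apply `select` to n_boot bootstrap resamples; return one model per resample."""
    n = len(rows)
    models = []
    for _ in range(n_boot):
        resample = [rows[rng.randrange(n)] for _ in range(n)]  # draw with replacement
        models.append(frozenset(select(resample, candidates)))
    return models


def inclusion_frequencies(models, candidates):
    """Fraction of resamples in which each candidate variable was selected."""
    return {v: sum(v in m for m in models) / len(models) for v in candidates}


def model_weights(models):
    """Weight of each distinct model = its bootstrap selection frequency."""
    return {m: c / len(models) for m, c in Counter(models).items()}


def screen_then_weight(rows, candidates, select, n_boot=200, threshold=0.3, seed=0):
    """Two steps: screen variables by inclusion frequency, then estimate weights."""
    rng = random.Random(seed)
    # Step 1: inclusion frequencies over all candidates (threshold is hypothetical).
    first = bootstrap_models(rows, candidates, select, n_boot, rng)
    freqs = inclusion_frequencies(first, candidates)
    kept = [v for v in candidates if freqs[v] >= threshold]
    # Step 2: fresh resamples, selecting only among the screened variables.
    second = bootstrap_models(rows, kept, select, n_boot, rng)
    return freqs, kept, model_weights(second)


def averaged_prediction(weights, predict):
    """Weighted average of per-model predictions; `predict` maps a model to a number."""
    return sum(w * predict(model) for model, w in weights.items())


def _corr(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def toy_select(rows, candidates, cutoff=0.3):
    """Toy stand-in for Cox variable selection: keep variables correlated with y."""
    ys = [r["y"] for r in rows]
    return [v for v in candidates if abs(_corr([r[v] for r in rows], ys)) > cutoff]


def toy_data(n=100, seed=1):
    """Simulated data: only x1 influences the outcome y; x2 and x3 are noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2, x3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append({"x1": x1, "x2": x2, "x3": x3, "y": 2 * x1 + rng.gauss(0, 1)})
    return rows


if __name__ == "__main__":
    freqs, kept, weights = screen_then_weight(toy_data(), ["x1", "x2", "x3"], toy_select)
    print("inclusion frequencies:", freqs)
    print("variables kept after screening:", kept)
    print("model weights:", {tuple(sorted(m)): w for m, w in weights.items()})
```

The screening step removes variables that are rarely selected, which shrinks the set of models that can receive weight in the second step; this is the mechanism the abstract credits with achieving parsimony. The sketch uses a continuous outcome and a correlation cutoff only to stay dependency-free; it does not fit a survival model.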

