The Need for Diagnostic Assessment of Bootstrap Predictive Models

UNSW Business School Research Paper Series. Pub Date: 2008-05-26. DOI: 10.2139/SSRN.1134607

Glen Barnett, B. Zehnwirth

Citations: 3

Abstract

The bootstrap is, at heart, a way to obtain an approximate sampling distribution for a statistic (and hence, if required, to produce a confidence interval). Where that statistic is a suitable estimator for a population parameter of interest, the bootstrap enables inferences about that parameter. In simple situations the bootstrap is very simple in form, but more complex situations can also be dealt with. The bootstrap can be modified to produce a predictive distribution (and hence, if required, prediction intervals). It is predictive distributions that are generally of prime interest to insurers, because they pay the outcome of the process, not its mean.

The bootstrap has become quite popular in reserving in recent years, but it must be used with caution. The bootstrap does not require the user to assume a distribution for the data; instead, sampling distributions are obtained by resampling the data. However, the bootstrap certainly does not avoid the need for assumptions, nor the need to check those assumptions. The bootstrap is far from a cure-all. It suffers from essentially the same problems as finding predictive distributions and sampling distributions of statistics by any other means.

These problems are exacerbated by the time-series nature of the forecasting problem: because reserving requires prediction into never-before-observed calendar periods, model inadequacy in the calendar year direction becomes a critical problem. In particular, the most popular actuarial techniques, those most often used with the bootstrap, have no parameters in that direction and are frequently mis-specified with respect to the behaviour against calendar time. Further, commonly used versions of the bootstrap can be sensitive to overparameterization, which is a common problem with standard techniques. In this paper, we describe these common problems in using the bootstrap and how to spot them.
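The difference between a sampling distribution and a predictive distribution can be illustrated with a minimal sketch. This is not the authors' procedure: the data are made up, the observations are assumed independent and identically distributed (exactly the sort of assumption the abstract says must be checked), and the function names `bootstrap_mean_distribution`, `bootstrap_predictive_distribution` and `percentile` are hypothetical helpers introduced only for illustration. The residuals are also used unadjusted, which is a simplification.

```python
import random
import statistics


def percentile(sorted_values, q):
    """Return an approximate q-th quantile (0 <= q <= 1) of a sorted list."""
    last = len(sorted_values) - 1
    index = min(last, max(0, round(q * last)))
    return sorted_values[index]


def bootstrap_mean_distribution(data, n_resamples, rng):
    """Approximate the sampling distribution of the mean by resampling
    the data with replacement (assumes i.i.d. observations)."""
    n = len(data)
    return sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )


def bootstrap_predictive_distribution(data, n_resamples, rng):
    """Approximate the distribution of one future outcome.

    Each draw adds a resampled residual (process noise) to a bootstrapped
    mean (parameter uncertainty). Residuals are unadjusted: a simplification.
    """
    n = len(data)
    centre = statistics.fmean(data)
    residuals = [x - centre for x in data]
    draws = []
    for _ in range(n_resamples):
        resampled_mean = statistics.fmean(rng.choices(data, k=n))
        draws.append(resampled_mean + rng.choice(residuals))
    return sorted(draws)


rng = random.Random(0)
# Hypothetical, made-up observations (not taken from the paper).
data = [102.0, 95.0, 110.0, 87.0, 120.0, 99.0, 105.0, 93.0, 115.0, 101.0]

means = bootstrap_mean_distribution(data, 2000, rng)
outcomes = bootstrap_predictive_distribution(data, 2000, rng)

ci = (percentile(means, 0.025), percentile(means, 0.975))
pi = (percentile(outcomes, 0.025), percentile(outcomes, 0.975))
print("95% confidence interval for the mean:", ci)
print("95% prediction interval for one outcome:", pi)
```

The prediction interval comes out wider than the confidence interval because it includes the variability of the outcome itself, which is the quantity an insurer pays. Both intervals are only as trustworthy as the i.i.d. assumption behind the resampling; if the data had a trend over time, neither would be reliable.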
