Efficient conformal regressors using bagged neural nets

2015 International Joint Conference on Neural Networks (IJCNN). Pub Date: 2015-10-01. DOI: 10.1109/IJCNN.2015.7280763

U. Johansson, Cecilia Sönströd, H. Linusson


Citations: 6

Abstract

Conformal predictors use machine learning models to output prediction sets. For regression, a prediction set is simply a prediction interval. All conformal predictors are valid, meaning that the error rate on novel data is bounded by a preset significance level. The key performance metric for conformal predictors is their efficiency, i.e., the size of the prediction sets. Inductive conformal predictors utilize real-valued functions, called nonconformity functions, and a calibration set, i.e., a set of labeled instances not used for model training, to obtain the prediction regions. In state-of-the-art conformal regressors, the nonconformity functions are normalized, i.e., they include a component estimating the difficulty of each instance. In this study, conformal regressors are built on top of ensembles of bagged neural networks, and several nonconformity functions are evaluated. In addition, the option to calibrate on out-of-bag instances instead of setting aside a calibration set is investigated. The experiments, using 33 publicly available data sets, show that normalized nonconformity functions can produce smaller prediction sets, but the efficiency is highly dependent on the quality of the difficulty estimation. Specifically, in this study, the most efficient normalized nonconformity function estimated the difficulty of an instance by calculating the average error of neighboring instances. These results are consistent with previous studies using random forests as underlying models. Calibrating on out-of-bag instances, however, led to more efficient conformal predictors only on smaller data sets, in sharp contrast to the random forest study, where out-of-bag calibration was significantly better overall.
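
As a rough illustration of the inductive conformal regression procedure described in the abstract, the following Python sketch calibrates a normalized nonconformity function and turns point predictions into prediction intervals. It is not the authors' implementation. The abstract does not give the exact nonconformity functions, so the common form |y - ŷ| / (σ + β) is assumed here, with the difficulty estimate σ taken as the mean absolute error of the k nearest reference instances. The names calibrate, predict_intervals, knn_difficulty and demo, the smoothing constant beta, the synthetic data, and the fixed stand-in predictor (used in place of a bagged neural network ensemble) are all hypothetical.

```python
import numpy as np


def calibrate(abs_errors, difficulty, significance, beta=0.01):
    """Return the nonconformity threshold for the given significance level.

    Assumed normalized nonconformity score: |y - y_hat| / (difficulty + beta).
    """
    scores = np.asarray(abs_errors, dtype=float) / (
        np.asarray(difficulty, dtype=float) + beta
    )
    n = scores.size
    # 1-indexed rank of the threshold among the sorted calibration scores.
    k = int(np.ceil((1.0 - significance) * (n + 1)))
    if k > n:
        return np.inf  # too few calibration instances for this level
    return np.sort(scores)[k - 1]


def predict_intervals(y_hat, difficulty, threshold, beta=0.01):
    """Scale the threshold by each instance's difficulty to get an interval."""
    half_width = threshold * (np.asarray(difficulty, dtype=float) + beta)
    return y_hat - half_width, y_hat + half_width


def knn_difficulty(x_query, x_ref, ref_abs_errors, k=10):
    """Difficulty = mean absolute error of the k nearest reference instances."""
    dist = np.abs(x_query[:, None] - x_ref[None, :])  # 1-D inputs assumed
    nearest = np.argsort(dist, axis=1)[:, :k]
    return ref_abs_errors[nearest].mean(axis=1)


def demo(seed=0, significance=0.1):
    rng = np.random.default_rng(seed)

    def sample(n):
        # Synthetic data whose noise level grows with x.
        x = rng.uniform(0.0, 1.0, n)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.05 + 0.5 * x, n)
        return x, y

    def predict(x):
        # Stand-in for the underlying model (an ensemble in the paper).
        return np.sin(2 * np.pi * x)

    x_ref, y_ref = sample(1000)   # errors used only for difficulty estimation
    x_cal, y_cal = sample(500)    # calibration set, not used for training
    x_test, y_test = sample(2000)

    ref_err = np.abs(y_ref - predict(x_ref))
    cal_diff = knn_difficulty(x_cal, x_ref, ref_err)
    test_diff = knn_difficulty(x_test, x_ref, ref_err)

    threshold = calibrate(np.abs(y_cal - predict(x_cal)), cal_diff, significance)
    lower, upper = predict_intervals(predict(x_test), test_diff, threshold)

    coverage = np.mean((y_test >= lower) & (y_test <= upper))
    return float(coverage), float(np.mean(upper - lower))


coverage, mean_width = demo()
print(f"coverage={coverage:.3f}, mean width={mean_width:.3f}")
```

In this sketch, validity corresponds to the empirical coverage being close to or above 1 - significance, and efficiency corresponds to the mean interval width. Replacing the separate calibration sample with out-of-bag predictions on the training instances would mimic the out-of-bag calibration option that the study investigates.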


