Reliability evaluation of individual predictions: a data-centric approach

Nima Shahbazi, Abolfazl Asudeh
Published in: The VLDB Journal
DOI: 10.1007/s00778-024-00857-w (https://doi.org/10.1007/s00778-024-00857-w)
Publication date: 2024-05-30
Citation count: 0

Abstract

Machine learning models only provide probabilistic guarantees on the expected loss of random samples from the distribution represented by their training data. As a result, a model with high accuracy may or may not be reliable for predicting an individual query point. To address this issue, XAI aims to provide explanations of individual predictions, while approaches such as conformal prediction, probabilistic predictions, and prediction intervals rely on the model’s certainty in its prediction to identify unreliable cases. Conversely, instead of relying on the model itself, we look for insights in the training data. That is, following the fact that a model’s performance is limited by the data it has been trained on, we ask “is a model trained on a given data set fit for making a specific prediction?”. Specifically, we argue that a model’s prediction is not reliable if (i) there were not enough instances in the training set similar to the query point, and (ii) there is a high fluctuation (uncertainty) in the vicinity of the query point in the training set. Using these two observations, we propose data-centric reliability measures for individual predictions and develop novel algorithms for efficient and effective computation of the reliability measures at inference time. The proposed algorithms learn the necessary components of the measures from the data itself and are sublinear, which makes them scalable to very large and multi-dimensional settings. Furthermore, an estimator is designed so that no data access is needed at inference time. We conduct extensive experiments using multiple real and synthetic data sets and different tasks, which reflect a consistent correlation between distrust values and model performance.
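The two observations above can be illustrated with a minimal sketch. This is not the paper's actual reliability measure or its sublinear algorithm; the function name, the choice of k-nearest-neighbor distance for sparsity, label variance for fluctuation, and the way the two terms are combined are all hypothetical simplifications for intuition only.

```python
import numpy as np

def data_centric_distrust(X_train, y_train, x_query, k=5):
    """Toy distrust score for one query point, combining:
    (i) sparsity: mean distance to the k nearest training instances
        (few similar instances -> large value), and
    (ii) fluctuation: variance of those neighbors' labels
        (high local uncertainty -> large value).
    """
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn_idx = np.argsort(dists)[:k]
    sparsity = dists[nn_idx].mean()
    fluctuation = np.var(y_train[nn_idx])
    return sparsity + fluctuation

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
y = np.array([0, 0, 0, 1, 0])

# Query inside a dense, label-consistent neighborhood: low distrust.
low = data_centric_distrust(X, y, np.array([0.05, 0.05]), k=3)
# Query in a sparse region with mixed neighbor labels: higher distrust.
high = data_centric_distrust(X, y, np.array([3.0, 3.0]), k=3)
print(low < high)  # prints True
```

A practical system would replace the brute-force distance scan with a sublinear index and calibrate how the two terms are weighted, which is where the paper's algorithmic contribution lies.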

