{"title":"机器学习系统中单个预测的可靠性估计:基于模型可靠性的方法","authors":"","doi":"10.1016/j.dss.2024.114305","DOIUrl":null,"url":null,"abstract":"<div><p>The conventional aggregated performance measure (i.e., mean squared error) with respect to the whole dataset would not provide desired safety and quality assurance for each individual prediction made by a machine learning model in risk-sensitive regression problems. In this paper, we propose an informative indicator <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> to quantify model reliability for individual prediction (MRIP) for the purpose of safeguarding the usage of machine learning (ML) models in mission-critical applications. Specifically, we define the reliability of a ML model with respect to its prediction on each individual input <span><math><mi>x</mi></math></span> as the probability of the observed difference between the prediction of ML model and the actual observation falling within a small interval when the input <span><math><mi>x</mi></math></span> varies within a small range subject to a preset distance constraint, namely <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced><mo>=</mo><mi>P</mi><mfenced><mrow></mrow><mrow><msup><mi>y</mi><mo>∗</mo></msup><mo>−</mo><msup><mover><mi>y</mi><mo>̂</mo></mover><mo>∗</mo></msup></mrow><mrow><mspace></mspace><mo>≤</mo><mi>ε</mi></mrow><mrow><msup><mi>x</mi><mo>∗</mo></msup><mo>∈</mo><mi>B</mi><mfenced><mi>x</mi></mfenced></mrow></mfenced></math></span>, where <span><math><msup><mi>y</mi><mo>∗</mo></msup></math></span> denotes the observed target value for the input <span><math><msup><mi>x</mi><mo>∗</mo></msup><mo>,</mo></math></span> <span><math><msup><mover><mi>y</mi><mo>̂</mo></mover><mo>∗</mo></msup></math></span> denotes the model prediction for the input <span><math><msup><mi>x</mi><mo>∗</mo></msup></math></span>, and <span><math><msup><mi>x</mi><mo>∗</mo></msup></math></span> is an input in the neighborhood of <span><math><mi>x</mi></math></span> subject to the constraint <span><math><mi>B</mi><mfenced><mi>x</mi></mfenced><mo>=</mo><mfenced><mrow><mfenced><msup><mi>x</mi><mo>∗</mo></msup></mfenced><mspace></mspace><mfenced><mrow><msup><mi>x</mi><mo>∗</mo></msup><mo>−</mo><mi>x</mi></mrow></mfenced><mo>≤</mo><mi>δ</mi></mrow></mfenced></math></span>. The developed MRIP indicator <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> provides a direct, objective, quantitative, and general-purpose measure of “reliability” or the probability of success of the ML model for each individual prediction by fully exploiting the local information associated with the input <span><math><mi>x</mi></math></span> and ML model. Next, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework to directly learn the relationship between <span><math><mi>x</mi></math></span> and its MRIP <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span>, thus enabling to provide the reliability estimate <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> for any unseen input instantly. Thirdly, we propose an information gain-based approach to help determine a threshold value pertaing to <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> in support of decision makings on when to accept or abstain from counting on the ML model prediction. 
Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets reveal that the developed ML-based framework for MRIP estimation shows a robust performance in improving the reliability estimate of individual prediction, and the MRIP indicator <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> thus provides an essential layer of safety net when adopting ML models in risk-sensitive environments.</p></div>","PeriodicalId":55181,"journal":{"name":"Decision Support Systems","volume":null,"pages":null},"PeriodicalIF":6.7000,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reliability estimation for individual predictions in machine learning systems: A model reliability-based approach\",\"authors\":\"\",\"doi\":\"10.1016/j.dss.2024.114305\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The conventional aggregated performance measure (i.e., mean squared error) with respect to the whole dataset would not provide desired safety and quality assurance for each individual prediction made by a machine learning model in risk-sensitive regression problems. In this paper, we propose an informative indicator <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> to quantify model reliability for individual prediction (MRIP) for the purpose of safeguarding the usage of machine learning (ML) models in mission-critical applications. Specifically, we define the reliability of a ML model with respect to its prediction on each individual input <span><math><mi>x</mi></math></span> as the probability of the observed difference between the prediction of ML model and the actual observation falling within a small interval when the input <span><math><mi>x</mi></math></span> varies within a small range subject to a preset distance constraint, namely <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced><mo>=</mo><mi>P</mi><mfenced><mrow></mrow><mrow><msup><mi>y</mi><mo>∗</mo></msup><mo>−</mo><msup><mover><mi>y</mi><mo>̂</mo></mover><mo>∗</mo></msup></mrow><mrow><mspace></mspace><mo>≤</mo><mi>ε</mi></mrow><mrow><msup><mi>x</mi><mo>∗</mo></msup><mo>∈</mo><mi>B</mi><mfenced><mi>x</mi></mfenced></mrow></mfenced></math></span>, where <span><math><msup><mi>y</mi><mo>∗</mo></msup></math></span> denotes the observed target value for the input <span><math><msup><mi>x</mi><mo>∗</mo></msup><mo>,</mo></math></span> <span><math><msup><mover><mi>y</mi><mo>̂</mo></mover><mo>∗</mo></msup></math></span> denotes the model prediction for the input <span><math><msup><mi>x</mi><mo>∗</mo></msup></math></span>, and <span><math><msup><mi>x</mi><mo>∗</mo></msup></math></span> is an input in the neighborhood of <span><math><mi>x</mi></math></span> subject to the constraint <span><math><mi>B</mi><mfenced><mi>x</mi></mfenced><mo>=</mo><mfenced><mrow><mfenced><msup><mi>x</mi><mo>∗</mo></msup></mfenced><mspace></mspace><mfenced><mrow><msup><mi>x</mi><mo>∗</mo></msup><mo>−</mo><mi>x</mi></mrow></mfenced><mo>≤</mo><mi>δ</mi></mrow></mfenced></math></span>. The developed MRIP indicator <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> provides a direct, objective, quantitative, and general-purpose measure of “reliability” or the probability of success of the ML model for each individual prediction by fully exploiting the local information associated with the input <span><math><mi>x</mi></math></span> and ML model. 
Next, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework to directly learn the relationship between <span><math><mi>x</mi></math></span> and its MRIP <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span>, thus enabling to provide the reliability estimate <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> for any unseen input instantly. Thirdly, we propose an information gain-based approach to help determine a threshold value pertaing to <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> in support of decision makings on when to accept or abstain from counting on the ML model prediction. Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets reveal that the developed ML-based framework for MRIP estimation shows a robust performance in improving the reliability estimate of individual prediction, and the MRIP indicator <span><math><mi>ℛ</mi><mfenced><mi>x</mi></mfenced></math></span> thus provides an essential layer of safety net when adopting ML models in risk-sensitive environments.</p></div>\",\"PeriodicalId\":55181,\"journal\":{\"name\":\"Decision Support Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":6.7000,\"publicationDate\":\"2024-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Decision Support Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167923624001386\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Decision Support Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167923624001386","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
The conventional aggregated performance measure (e.g., mean squared error) computed over the whole dataset does not provide the desired safety and quality assurance for each individual prediction made by a machine learning model in risk-sensitive regression problems. In this paper, we propose an informative indicator $\mathcal{R}(x)$ to quantify model reliability for individual prediction (MRIP), with the goal of safeguarding the use of machine learning (ML) models in mission-critical applications. Specifically, we define the reliability of an ML model with respect to its prediction on an individual input $x$ as the probability that the difference between the model's prediction and the actual observation falls within a small interval when the input varies within a small range subject to a preset distance constraint, namely $\mathcal{R}(x) = P\big(|y^{*} - \hat{y}^{*}| \le \varepsilon \mid x^{*} \in B(x)\big)$, where $y^{*}$ denotes the observed target value for the input $x^{*}$, $\hat{y}^{*}$ denotes the model prediction for the input $x^{*}$, and $x^{*}$ is an input in the neighborhood of $x$ subject to the constraint $B(x) = \{x^{*} : \|x^{*} - x\| \le \delta\}$. The developed MRIP indicator $\mathcal{R}(x)$ provides a direct, objective, quantitative, and general-purpose measure of "reliability", i.e., the probability of success of the ML model on each individual prediction, by fully exploiting the local information associated with the input $x$ and the ML model. Second, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework that directly learns the relationship between $x$ and its MRIP $\mathcal{R}(x)$, enabling it to provide the reliability estimate $\mathcal{R}(x)$ for any unseen input instantly. Third, we propose an information gain-based approach to determine a threshold value pertaining to $\mathcal{R}(x)$ in support of decisions on when to accept, or abstain from relying on, the ML model's prediction. Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets show that the developed ML-based framework for MRIP estimation robustly improves the reliability estimates of individual predictions; the MRIP indicator $\mathcal{R}(x)$ thus provides an essential safety net when adopting ML models in risk-sensitive environments.
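To make the definition concrete, below is a minimal empirical sketch of $\mathcal{R}(x)$ in Python. It is an illustrative reading of the formula, not the authors' estimator: for a query point $x$, it approximates $\mathcal{R}(x)$ by the fraction of labelled reference points inside the ball $B(x)$ whose absolute prediction error is at most $\varepsilon$. The function name `estimate_mrip` and the default values of `delta` and `epsilon` are assumptions for illustration.

```python
# Empirical sketch of R(x) = P(|y* - yhat*| <= eps | x* in B(x)),
# estimated from labelled reference data rather than the true distribution.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def estimate_mrip(model, X_ref, y_ref, X_query, delta=0.5, epsilon=0.1):
    """Fraction of reference points within distance delta of each query
    point whose absolute prediction error is at most epsilon."""
    errors = np.abs(y_ref - model.predict(X_ref))        # |y* - yhat*| per reference point
    nn = NearestNeighbors(radius=delta).fit(X_ref)
    # Indices of reference points inside B(x) = {x*: ||x* - x|| <= delta}
    neighbourhoods = nn.radius_neighbors(X_query, return_distance=False)
    scores = []
    for idx in neighbourhoods:
        if len(idx) == 0:
            scores.append(np.nan)                        # no local evidence available
        else:
            scores.append(np.mean(errors[idx] <= epsilon))
    return np.array(scores)
```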
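The two-stage framework can likewise be sketched under stated assumptions (this is one plausible pipeline consistent with the abstract, not necessarily the paper's exact design): stage 1 fits the task model and computes empirical MRIP labels on held-out calibration data, reusing `estimate_mrip` above; stage 2 fits a surrogate regressor that maps $x$ directly to $\mathcal{R}(x)$, so any unseen input receives an instant reliability estimate without neighborhood search. The model class and split ratio are illustrative choices.

```python
# Two-stage sketch: (1) task model + empirical MRIP labels on calibration
# data, (2) surrogate regressor mapping x directly to R(x).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

def fit_two_stage(X, y, delta=0.5, epsilon=0.1):
    X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)
    task_model = GradientBoostingRegressor().fit(X_tr, y_tr)          # stage 1: task model
    r_cal = estimate_mrip(task_model, X_cal, y_cal, X_cal, delta, epsilon)
    mask = ~np.isnan(r_cal)                                           # drop points with empty neighborhoods
    reliability_model = GradientBoostingRegressor().fit(X_cal[mask], r_cal[mask])  # stage 2: surrogate
    return task_model, reliability_model

# Usage: for an unseen input x_new,
#   yhat  = task_model.predict(x_new)          # the prediction itself
#   r_hat = reliability_model.predict(x_new)   # its instant reliability estimate
```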
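Finally, one plausible reading of the information gain-based thresholding (the abstract does not spell out the procedure, so this is a hedged interpretation) is a decision-stump-style search: choose the cut-off $t$ on $\mathcal{R}(x)$ that maximizes the information gain of splitting predictions into "acceptable" ($|y - \hat{y}| \le \varepsilon$) versus not, and abstain whenever the estimated $\mathcal{R}(x)$ falls below $t$.

```python
# Sketch of an information-gain threshold on R(x): pick the cut-off that
# best separates acceptable predictions from the rest, as a tree stump would.
import numpy as np

def entropy(p):
    """Binary entropy in bits of a Bernoulli(p) variable."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def information_gain_threshold(r_scores, abs_errors, epsilon=0.1):
    ok = (abs_errors <= epsilon).astype(float)    # 1 = acceptable prediction
    base = entropy(ok.mean())
    best_t, best_gain = None, -np.inf
    for t in np.unique(r_scores):
        left, right = ok[r_scores < t], ok[r_scores >= t]
        if len(left) == 0 or len(right) == 0:
            continue                              # degenerate split, skip
        w = len(left) / len(ok)
        gain = base - (w * entropy(left.mean()) + (1 - w) * entropy(right.mean()))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t                                 # abstain when R(x) < best_t
```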
Journal Introduction:
The common thread of articles published in Decision Support Systems is their relevance to theoretical and technical issues in the support of enhanced decision making. The areas addressed may include foundations, functionality, interfaces, implementation, impacts, and evaluation of decision support systems (DSSs).