H-Calibration: Rethinking Classifier Recalibration With Probabilistic Error-Bounded Objective

Wenjian Huang;Guiping Cao;Jiahao Xia;Jingkun Chen;Hao Wang;Jianguo Zhang
{"title":"H-Calibration: Rethinking Classifier Recalibration With Probabilistic Error-Bounded Objective","authors":"Wenjian Huang;Guiping Cao;Jiahao Xia;Jingkun Chen;Hao Wang;Jianguo Zhang","doi":"10.1109/TPAMI.2025.3582796","DOIUrl":null,"url":null,"abstract":"Deep neural networks have demonstrated remarkable performance across numerous learning tasks but often suffer from miscalibration, resulting in unreliable probability outputs. This has inspired many recent works on mitigating miscalibration, particularly through post-hoc recalibration methods that aim to obtain calibrated probabilities without sacrificing the classification performance of pre-trained models. In this study, we summarize and categorize previous works into three general strategies: intuitively designed methods, binning-based methods, and methods based on formulations of ideal calibration. Through theoretical and practical analysis, we highlight ten common limitations in previous approaches. To address these limitations, we propose a probabilistic learning framework for calibration called <inline-formula><tex-math>$h$</tex-math></inline-formula>-calibration, which theoretically constructs an equivalent learning formulation for canonical calibration with boundedness. On this basis, we design a simple yet effective post-hoc calibration algorithm. Our method not only overcomes the ten identified limitations but also achieves markedly better performance than traditional methods, as validated by extensive experiments. We further analyze, both theoretically and experimentally, the relationship and advantages of our learning objective compared to traditional proper scoring rule. In summary, our probabilistic framework derives an approximately equivalent differentiable objective for learning error-bounded calibrated probabilities, elucidating the correspondence and convergence properties of computational statistics with respect to theoretical bounds in canonical calibration. The theoretical effectiveness is verified on standard post-hoc calibration benchmarks by achieving state-of-the-art performance. This research offers valuable reference for learning reliable likelihood in related fields.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 10","pages":"9023-9042"},"PeriodicalIF":18.6000,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11049017/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Deep neural networks have demonstrated remarkable performance across numerous learning tasks but often suffer from miscalibration, resulting in unreliable probability outputs. This has inspired many recent works on mitigating miscalibration, particularly through post-hoc recalibration methods that aim to obtain calibrated probabilities without sacrificing the classification performance of pre-trained models. In this study, we summarize and categorize previous works into three general strategies: intuitively designed methods, binning-based methods, and methods based on formulations of ideal calibration. Through theoretical and practical analysis, we highlight ten common limitations in previous approaches. To address these limitations, we propose a probabilistic learning framework for calibration called $h$-calibration, which theoretically constructs an equivalent, error-bounded learning formulation of canonical calibration. On this basis, we design a simple yet effective post-hoc calibration algorithm. Our method not only overcomes the ten identified limitations but also achieves markedly better performance than traditional methods, as validated by extensive experiments. We further analyze, both theoretically and experimentally, the relationship and advantages of our learning objective compared to traditional proper scoring rules. In summary, our probabilistic framework derives an approximately equivalent differentiable objective for learning error-bounded calibrated probabilities, elucidating the correspondence and convergence properties of computational statistics with respect to theoretical bounds in canonical calibration. Its theoretical effectiveness is verified on standard post-hoc calibration benchmarks by achieving state-of-the-art performance. This research offers a valuable reference for learning reliable likelihoods in related fields.
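The abstract contrasts the proposed error-bounded objective with the classical family of post-hoc recalibration methods driven by proper scoring rules; a classifier is canonically calibrated when $\mathbb{P}(Y = y \mid \hat{p}(X) = p) = p_y$ for every class $y$. For readers unfamiliar with that baseline family, below is a minimal, hypothetical sketch of temperature scaling fit by negative log-likelihood (a proper scoring rule) on held-out validation logits. All names and the synthetic data are illustrative; this is the standard baseline setting only, not the paper's $h$-calibration algorithm, whose objective is derived in the full text.

```python
# A minimal sketch of the classical post-hoc recalibration baseline the
# abstract contrasts against: temperature scaling fit by negative
# log-likelihood (a proper scoring rule). Synthetic data and names are
# illustrative; this is NOT the paper's h-calibration objective.
import numpy as np
from scipy.optimize import minimize_scalar

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def nll(T, logits, labels):
    """Negative log-likelihood of held-out labels at temperature T."""
    probs = softmax(logits / T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels):
    """Fit a single scalar T > 0 on a validation split. For any T > 0
    the argmax of the logits, and hence accuracy, is unchanged."""
    res = minimize_scalar(nll, bounds=(0.05, 20.0), args=(logits, labels),
                          method="bounded")
    return res.x

# Synthetic demo: labels are drawn from well-calibrated scores, but the
# model reports those scores scaled by 4, i.e. it is overconfident.
rng = np.random.default_rng(0)
base = rng.normal(size=(2048, 10))
labels = np.array([rng.choice(10, p=p) for p in softmax(base)])
logits = 4.0 * base
T = fit_temperature(logits, labels)   # recovers a temperature close to 4
calibrated = softmax(logits / T)      # recalibrated class probabilities
```

Per the abstract, the paper departs from this setup by replacing the proper-scoring-rule objective with an approximately equivalent differentiable objective whose minimization bounds the canonical calibration error; its exact form is given in the full paper.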