Deconstructing deep imbalanced regression: a comprehensive review and experimental evaluation

IF 13.9 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Artificial Intelligence Review Pub Date : 2026-04-22 Epub Date: 2026-04-29 DOI:10.1007/s10462-026-11570-1
Noah C. Puetz, Jens U. Brandt, Marc Hilbert, Elena Raponi, Thomas Bäck, Thomas Bartz-Beielstein
Artificial Intelligence Review, vol. 59 (6). Publication type: Journal Article. Article page: https://link.springer.com/article/10.1007/s10462-026-11570-1 PDF: https://link.springer.com/content/pdf/10.1007/s10462-026-11570-1.pdf
Citation count: 0

Abstract

In real-world applications, there is a fundamental problem: the data most critical to predict interesting events, anomalies, and high-stakes outliers are the rarest, while less interesting data is abundant. Although deep learning is deployed specifically for these difficult prediction tasks, data-driven models inevitably fail in underrepresented areas. This discrepancy between the empirical data- and the desired evaluation distribution is equivalent to a target distribution shift. The research field, termed Deep Imbalanced Regression (DIR), has emerged explicitly to address this challenge, which is particularly acute for continuous targets where most conventional classification-based methods are ill-suited. In this paper, we present the first comprehensive review of the DIR landscape, organized around a novel two-axis taxonomy that disentangles challenges along a Data Axis (target distribution shift, continuity, and density) and a Deep-Learning Axis (shared capacity, biased updates, and manifold distortion), where the latter captures a cascading failure mechanism through which deep models systematically neglect underrepresented targets. Within this framework, we systematically categorize and analyze 19 state-of-the-art methods spanning architectural, algorithm-level, and representation learning approaches, and empirically re-evaluate twelve of them with publicly available implementations under controlled, identical conditions. To stress-test generalization across the full target range, we introduce three novel targeted evaluation protocols, Balanced Extrapolation, Bimodal Interpolation, and Blind-Spot Isolation, that expose failure modes hidden by standard benchmarks (https://github.com/noah-puetz/deconstructing_deep_imbalanced_regression). 
Our study underscores the significant impact of imbalance on regression accuracy, offering a conceptual framework and practical benchmarks to catalyze further development of systems capable of capturing the rare as reliably as the common.
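
The gap between the empirical data distribution and the desired evaluation distribution can be made concrete with a small sketch. The code below compares a plain mean absolute error, which weights every sample equally and is therefore dominated by the common targets, with a bin-balanced error that gives each region of the target range equal weight. The function names, the equal-width binning, and the synthetic data are illustrative assumptions; they are not taken from the paper or its repository, and the paper's own evaluation protocols may differ.

```python
import numpy as np


def plain_mae(y_true, y_pred):
    """MAE under the empirical (imbalanced) target distribution."""
    return float(np.mean(np.abs(y_true - y_pred)))


def balanced_mae(y_true, y_pred, n_bins=10):
    """Mean of per-bin MAEs over equal-width target bins.

    Every non-empty bin counts equally, which approximates evaluation
    under a uniform target distribution (an assumption of this sketch).
    """
    edges = np.linspace(y_true.min(), y_true.max(), n_bins + 1)
    # Interior edges map each target to a bin index in [0, n_bins - 1].
    bin_ids = np.digitize(y_true, edges[1:-1])
    errors = np.abs(y_true - y_pred)
    per_bin = [
        errors[bin_ids == b].mean()
        for b in range(n_bins)
        if np.any(bin_ids == b)  # skip empty bins
    ]
    return float(np.mean(per_bin))


# Synthetic imbalanced targets: many common values, few rare high values.
rng = np.random.default_rng(0)
y_true = np.concatenate([rng.normal(0.0, 1.0, 5000), rng.uniform(4.0, 8.0, 50)])

# Hypothetical model that ignores the rare region and predicts the mean.
y_pred = np.full_like(y_true, y_true.mean())

print("plain MAE:   ", plain_mae(y_true, y_pred))
print("balanced MAE:", balanced_mae(y_true, y_pred))
```

With data like this, the plain error stays small because the rare targets barely contribute to the average, while the balanced error exposes how poorly the underrepresented range is predicted.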

Source journal
Artificial Intelligence Review (Engineering and Technology, Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Articles published: 194
Review time: 5.3 months
Journal description: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.