Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics.

Transactions on machine learning research Pub Date : 2023-10-01

Nihal Murali, Aahlad Puli, Ke Yu, Rajesh Ranganath, Kayhan Batmanghelich

{"title":"Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics.","authors":"Nihal Murali, Aahlad Puli, Ke Yu, Rajesh Ranganath, Kayhan Batmanghelich","doi":"","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) are prone to learning spurious features that correlate with the label during training but are irrelevant to the learning problem. This hurts model generalization and poses problems when deploying them in safety-critical applications. This paper aims to better understand the effects of spurious features through the lens of the learning dynamics of the internal neurons during the training process. We make the following observations: (1) While previous works highlight the harmful effects of spurious features on the generalization ability of DNNs, we emphasize that not all spurious features are harmful. Spurious features can be \"benign\" or \"harmful\" depending on whether they are \"harder\" or \"easier\" to learn than the core features for a given model. This definition is model and dataset dependent. (2) We build upon this premise and use instance difficulty methods (like Prediction Depth (Baldock et al., 2021)) to quantify \"easiness\" for a given model and to identify this behavior during the training phase. (3) We empirically show that the harmful spurious features can be detected by observing the learning dynamics of the DNN's early layers. In other words, easy features learned by the initial layers of a DNN early during the training can (potentially) hurt model generalization. We verify our claims on medical and vision datasets, both simulated and real, and justify the empirical success of our hypothesis by showing the theoretical connections between Prediction Depth and information-theoretic concepts like <math><mi>𝒱</mi></math>-usable information (Ethayarajh et al., 2021). Lastly, our experiments show that monitoring only accuracy during training (as is common in machine learning pipelines) is insufficient to detect spurious features. We, therefore, highlight the need for monitoring early training dynamics using suitable instance difficulty metrics.","PeriodicalId":75238,"journal":{"name":"Transactions on machine learning research","volume":"2023 ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11029547/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transactions on machine learning research","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Deep Neural Networks (DNNs) are prone to learning spurious features that correlate with the label during training but are irrelevant to the learning problem. This hurts model generalization and poses problems when deploying them in safety-critical applications. This paper aims to better understand the effects of spurious features through the lens of the learning dynamics of the internal neurons during the training process. We make the following observations: (1) While previous works highlight the harmful effects of spurious features on the generalization ability of DNNs, we emphasize that not all spurious features are harmful. Spurious features can be "benign" or "harmful" depending on whether they are "harder" or "easier" to learn than the core features for a given model. This definition is model and dataset dependent. (2) We build upon this premise and use instance difficulty methods (like Prediction Depth (Baldock et al., 2021)) to quantify "easiness" for a given model and to identify this behavior during the training phase. (3) We empirically show that the harmful spurious features can be detected by observing the learning dynamics of the DNN's early layers. In other words, easy features learned by the initial layers of a DNN early during the training can (potentially) hurt model generalization. We verify our claims on medical and vision datasets, both simulated and real, and justify the empirical success of our hypothesis by showing the theoretical connections between Prediction Depth and information-theoretic concepts like $𝒱$ -usable information (Ethayarajh et al., 2021). Lastly, our experiments show that monitoring only accuracy during training (as is common in machine learning pipelines) is insufficient to detect spurious features. We, therefore, highlight the need for monitoring early training dynamics using suitable instance difficulty metrics.

本刊更多论文

超越分布偏移：从训练动态的角度看虚假特征。

深度神经网络（DNN）在训练过程中容易学习到与标签相关但与学习问题无关的虚假特征。这会损害模型的泛化，在安全关键型应用中部署它们时会遇到问题。本文旨在通过内部神经元在训练过程中的学习动态，更好地理解虚假特征的影响。我们提出以下几点看法：（1）虽然之前的研究强调了虚假特征对 DNN 泛化能力的有害影响，但我们强调并非所有虚假特征都是有害的。虚假特征可以是 "良性 "的，也可以是 "有害 "的，这取决于它们比给定模型的核心特征 "更难 "学习还是 "更容易 "学习。这一定义取决于模型和数据集。(2) 在此基础上，我们使用实例难度方法（如 Prediction Depth，Baldock 等人，2021 年）来量化给定模型的 "易学性"，并在训练阶段识别这种行为。(3) 我们通过经验证明，通过观察 DNN 早期层的学习动态，可以检测出有害的虚假特征。换句话说，DNN 初始层在训练初期学习到的简单特征（可能）会损害模型的泛化。我们在医学和视觉数据集（包括模拟和真实数据集）上验证了我们的说法，并通过展示预测深度和信息论概念（如𝒱-usable information）之间的理论联系（Ethayarajh 等人，2021 年）来证明我们的假设在经验上是成功的。最后，我们的实验表明，在训练过程中只监控准确率（这在机器学习管道中很常见）不足以检测到虚假特征。因此，我们强调需要使用合适的实例难度指标来监控早期的训练动态。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Transactions on machine learning research

自引率

0.00%

发文量