Feature Normalization Prevents Collapse of Noncontrastive Learning Dynamics.

IF 2.1 · CAS Tier 4 (Computer Science) · JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Han Bao
{"title":"Feature Normalization Prevents Collapse of Noncontrastive Learning Dynamics.","authors":"Han Bao","doi":"10.1162/neco.a.27","DOIUrl":null,"url":null,"abstract":"<p><p>Contrastive learning is a self-supervised representation learning framework where two positive views generated through data augmentation are made similar by an attraction force in a data representation space, while a repulsive force makes them far from negative examples. Noncontrastive learning, represented by BYOL and SimSiam, gets rid of negative examples and improves computational efficiency. While learned representations may collapse into a single point due to the lack of the repulsive force at first sight, Tian et al. (2021) revealed through learning dynamics analysis that the representations can avoid collapse if data augmentation is sufficiently stronger than regularization. However, their analysis does not take into account commonly used feature normalization, a normalizer before measuring the similarity of representations, and hence excessively strong regularization may still collapse the dynamics, an unnatural behavior under the presence of feature normalization. Therefore, we extend the previous theory based on the L2 loss by considering the cosine loss instead, which involves feature normalization. We show that the cosine loss induces sixth-order dynamics (while the L2 loss induces a third-order one), in which a stable equilibrium dynamically emerges even if there are only collapsed solutions with given initial parameters. Thus, we offer a new understanding that feature normalization plays an important role in robustly preventing the dynamics collapse.</p>","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":" ","pages":"2079-2124"},"PeriodicalIF":2.1000,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/neco.a.27","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Contrastive learning is a self-supervised representation learning framework in which two positive views generated through data augmentation are pulled together by an attraction force in the representation space, while a repulsive force pushes them away from negative examples. Noncontrastive learning, represented by BYOL and SimSiam, dispenses with negative examples and thereby improves computational efficiency. Although at first sight the learned representations might collapse into a single point due to the lack of a repulsive force, Tian et al. (2021) revealed through a learning dynamics analysis that the representations can avoid collapse if data augmentation is sufficiently stronger than regularization. However, their analysis does not take into account the commonly used feature normalization, a normalizer applied before measuring the similarity of representations, and hence excessively strong regularization may still collapse the dynamics, an unnatural behavior in the presence of feature normalization. Therefore, we extend the previous theory, which is based on the L2 loss, by instead considering the cosine loss, which involves feature normalization. We show that the cosine loss induces sixth-order dynamics (while the L2 loss induces third-order dynamics), in which a stable equilibrium dynamically emerges even if only collapsed solutions exist for the given initial parameters. Thus, we offer a new understanding that feature normalization plays an important role in robustly preventing collapse of the dynamics.
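For concreteness, the sketch below contrasts the two losses in PyTorch for a SimSiam-style setup. This is an illustration under standard BYOL/SimSiam conventions (stop-gradient on the target branch), not code from the paper; the helper names `l2_loss` and `cosine_loss` are hypothetical.

```python
import torch
import torch.nn.functional as F

def l2_loss(p: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    # Squared-error (L2) loss between predictor output p and target z.
    # The target branch is detached (stop-gradient), as in BYOL/SimSiam.
    # Without feature normalization, strong regularization can drive
    # both branches toward the origin, i.e., complete collapse.
    return ((p - z.detach()) ** 2).sum(dim=1).mean()

def cosine_loss(p: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    # Negative cosine similarity between p and z. Both features are
    # L2-normalized before the inner product; this normalization is the
    # "feature normalization" whose effect the paper analyzes.
    p = F.normalize(p, dim=1)           # p / ||p||
    z = F.normalize(z.detach(), dim=1)  # stop-gradient on the target
    return -(p * z).sum(dim=1).mean()

# Toy usage: features of two augmented views, shape (batch, dim).
p = torch.randn(8, 128, requires_grad=True)
z = torch.randn(8, 128)
print(l2_loss(p, z).item(), cosine_loss(p, z).item())
```

Because `cosine_loss` is invariant to the scale of either feature, its gradient dynamics differ qualitatively from those of the unnormalized L2 loss, which is the distinction the abstract's sixth-order versus third-order comparison refers to.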

Source journal: Neural Computation (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 6.30
Self-citation rate: 3.40%
Articles per year: 83
Review time: 3.0 months
Journal description: Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.