Augmentation-aware self-supervised learning with conditioned projector

IF 7.2 | CAS Region 1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)
Marcin Przewięźlikowski, Mateusz Pyla, Bartosz Zieliński, Bartłomiej Twardowski, Jacek Tabor, Marek Śmieja
Journal: Knowledge-Based Systems
DOI: 10.1016/j.knosys.2024.112572
Published: 2024-10-11 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0950705124012061
Citations: 0

Abstract

Self-supervised learning (SSL) is a powerful technique for learning from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo can reach quality on par with supervised approaches. However, this invariance may be detrimental for solving downstream tasks that depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. For the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
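The core mechanism the abstract describes can be illustrated with a minimal sketch: the projector receives the backbone representation concatenated with a vector encoding the augmentation parameters of that view, so the SSL objective can exploit augmentation information only if the backbone preserves it. All names, dimensions, and the toy single-layer projector below are illustrative assumptions; the paper's actual architecture and augmentation encoding may differ.

```python
import random

def feature_extractor(image, dim=8):
    # Stand-in for a backbone (e.g. a ResNet): deterministically maps a view
    # to a representation vector of length `dim`.
    rng = random.Random(hash(tuple(image)) % (2**32))
    return [rng.random() for _ in range(dim)]

def conditioned_projector(representation, aug_params, weights):
    # The conditioning step: concatenate the representation with the
    # augmentation-parameter vector, then apply a linear map
    # (a real projector would be a small MLP).
    x = representation + aug_params
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]

# Toy example: one augmented view and an encoding of its augmentations,
# e.g. a normalized crop scale and a color-jitter strength (hypothetical).
image_view = [0.2, 0.5, 0.1]
aug_params = [0.3, 0.9]

h = feature_extractor(image_view)            # representation fed to downstream tasks
W = [[0.1] * (len(h) + len(aug_params)) for _ in range(4)]
z = conditioned_projector(h, aug_params, W)  # embedding consumed by the SSL loss
print(len(h), len(z))
```

Because the joint-embedding loss is computed on `z`, which sees the augmentation vector, the gradient pressure to encode augmentation traits falls on the projector rather than forcing the backbone representation `h` to discard them; this is why the approach plugs into different SSL objectives without architectural changes.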
Source Journal
Knowledge-Based Systems (Engineering/Technology – Computer Science: Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles per year: 1245
Review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.