Which images can be effectively learnt from self-supervised learning?
Michalis Lazarou, Sara Atito, Muhammad Awais, Josef Kittler
Pattern Recognition Letters, Volume 198, Pages 8-13 (published 2025-09-13)
DOI: 10.1016/j.patrec.2025.09.003
Abstract
Self-supervised learning has shown unprecedented success in learning expressive representations that can be used effectively to solve downstream tasks. However, while the impressive results of self-supervised learning are undeniable, there remains a certain mystery regarding how self-supervised models learn, what features they learn and, most importantly, which examples are hard to learn. Contrastive learning is one of the prominent lines of research in self-supervised learning, where a subcategory of methods relies on knowledge distillation between a student network and a teacher network that is an exponential moving average of the student, as initially proposed in the seminal DINO work. In this work we investigate models trained using this family of self-supervised methods and reveal certain properties about them. Specifically, we propose a novel perspective, through the lens of information theory, on understanding which examples and which classes are difficult to learn effectively during training.
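For readers unfamiliar with this family of methods, the sketch below illustrates the two ingredients the abstract mentions: a DINO-style teacher whose weights are an exponential moving average of the student's, and a per-example Shannon entropy of the model's output distribution as one plausible information-theoretic difficulty proxy. This is a minimal PyTorch illustration assuming standard DINO conventions; the entropy function, its name and its temperature are assumptions for exposition, not necessarily the specific measure used in the paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.996) -> None:
    """DINO-style update: the teacher's weights track an exponential
    moving average (EMA) of the student's weights."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s.detach(), alpha=1.0 - momentum)

def prediction_entropy(logits: torch.Tensor,
                       temperature: float = 0.04) -> torch.Tensor:
    """Shannon entropy (in nats) of the softmax output for each example.
    NOTE: an illustrative information-theoretic difficulty proxy, not
    the paper's specific measure; higher entropy = less confident."""
    probs = F.softmax(logits / temperature, dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
```

Under this reading, examples whose outputs remain high-entropy late in training would be flagged as hard to learn.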
About the journal
Pattern Recognition Letters aims at the rapid publication of concise articles of broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.