Jigsaw Self-Supervised Visual Representation Learning: An Applied Comparative Analysis Study
Yomna A. Kawashti, D. Khattab, M. Aref
2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), published 2022-05-08
DOI: 10.1109/MIUCC55081.2022.9781725
Citations: 0
Abstract
Self-supervised learning has been gaining momentum in the computer vision community as a promising contender to replace supervised learning. It aims to leverage unlabeled data by training a network on a proxy task and then transferring the learned representations to a downstream task. Jigsaw is one of the proxy tasks used to learn better feature representations in self-supervised learning. In this work, we comparatively evaluated the transferability of jigsaw pretraining using different architectures and a different dataset for jigsaw training. The features extracted from each convolutional block were evaluated on a unified downstream task. The best performance was achieved by the shallower AlexNet architecture, whose second convolutional block yielded the best transferability with a mean average precision of 36.17. We conclude that this behavior could be attributed to the smaller scale of the dataset we used: features extracted from earlier, shallower blocks transferred better to a dataset from a different domain.
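The jigsaw proxy task described in the abstract shuffles the patches of an image and trains a network to predict which permutation was applied. A minimal NumPy sketch of the patch-shuffling step is shown below; it assumes the standard 3x3 grid formulation, and the function name and toy image are illustrative, not taken from the paper:

```python
import numpy as np

def make_jigsaw_example(image, permutation):
    """Split a square image into a 3x3 grid of patches and reorder
    them according to `permutation` (a sequence of indices 0..8).
    In jigsaw pretraining, the index of `permutation` within a fixed
    permutation set serves as the classification label."""
    h, w = image.shape[:2]
    ph, pw = h // 3, w // 3
    # Row-major list of the nine patches.
    patches = [image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
               for r in range(3) for c in range(3)]
    return [patches[i] for i in permutation]

# Toy usage: a 9x9 "image" where each 3x3 patch is filled with its
# own row-major index, so the shuffle is easy to verify by eye.
image = np.kron(np.arange(9).reshape(3, 3), np.ones((3, 3), dtype=int))
permutation = [8, 7, 6, 5, 4, 3, 2, 1, 0]  # one entry of a permutation set
shuffled = make_jigsaw_example(image, permutation)
```

In the full pretext task, each shuffled patch is passed through a shared convolutional backbone and the concatenated features feed a classifier that predicts the permutation index.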