Data Augmentation Guidelines for Cross-Dataset Transfer Learning and Pseudo Labeling

2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) | Pub Date: 2021-10-01 | DOI: 10.1109/sibgrapi54419.2021.00036

Fernando Pereira dos Santos, Gabriela Thume, M. Ponti


Citations: 1

Abstract

Convolutional Neural Networks require large amounts of labeled data to be trained. To improve performance, a widely used practical approach is to augment the training set, generating compatible data. Standard data augmentation for images includes conventional techniques such as rotation, shift, and flip. In this paper, we go beyond such methods by studying alternative augmentation procedures for cross-dataset scenarios, in which a source dataset is used for training and a target dataset is used for testing. Through an extensive analysis considering different paradigms, saturation, and combination procedures, we provide guidelines for using augmentation methods to benefit transfer learning scenarios. As a novel approach for self-supervised learning, we also propose using data augmentation techniques as pseudo labels during training. Our techniques prove to be robust alternatives across different transfer learning domains and also benefit self-supervised learning scenarios.
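
The idea of using augmentations as pseudo labels can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which augmentations or training setup the paper uses, so the augmentation set, the function name `make_pseudo_labeled_batch`, and the assumption of square images are all hypothetical. Each image receives one randomly chosen transformation, and the index of that transformation serves as the training target in place of a human-provided label.

```python
import numpy as np

# Hypothetical augmentation set. The abstract mentions rotation, shift, and
# flip only as conventional examples, so this particular list is an assumption.
AUGMENTATIONS = [
    lambda img: img,                            # 0: identity
    lambda img: np.rot90(img, k=1),             # 1: rotate 90 degrees counter-clockwise
    lambda img: np.fliplr(img),                 # 2: horizontal flip
    lambda img: np.roll(img, shift=2, axis=1),  # 3: circular shift of 2 pixels along the width
]

def make_pseudo_labeled_batch(images, rng):
    """Apply one randomly chosen augmentation to each image.

    images: array of shape (N, H, W) or (N, H, W, C) with H == W
            (square images keep their shape under a 90-degree rotation).
    Returns the augmented images and, as pseudo labels, the index of the
    augmentation applied to each image.
    """
    labels = rng.integers(0, len(AUGMENTATIONS), size=len(images))
    augmented = np.stack([AUGMENTATIONS[k](img) for img, k in zip(images, labels)])
    return augmented, labels

# Usage: unlabeled images become a supervised batch of (image, augmentation index).
rng = np.random.default_rng(0)
unlabeled = rng.random((8, 32, 32))
x, y = make_pseudo_labeled_batch(unlabeled, rng)
```

A classifier trained to predict `y` from `x` needs no manual annotation, which is what makes this kind of pretext task usable in a self-supervised setting.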



