Data Augmentation Guidelines for Cross-Dataset Transfer Learning and Pseudo Labeling

Fernando Pereira dos Santos, Gabriela Thume, M. Ponti
DOI: 10.1109/sibgrapi54419.2021.00036
Venue: 2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)
Published: 2021-10-01
Citations: 1

Abstract

Convolutional Neural Networks require large amounts of labeled data in order to be trained. To improve such performances, a practical approach widely used is to augment the training set data, generating compatible data. Standard data augmentation for images includes conventional techniques, such as rotation, shift, and flip. In this paper, we go beyond such methods by studying alternative augmentation procedures for cross-dataset scenarios, in which a source dataset is used for training and a target dataset is used for testing. Through an extensive analysis considering different paradigms, saturation, and combination procedures, we provide guidelines for using augmentation methods in favor of transfer learning scenarios. As a novel approach for self-supervised learning, we also propose data augmentation techniques as pseudo labels during training. Our techniques demonstrate themselves as robust alternatives for different domains of transfer learning, including benefiting scenarios for self-supervised learning.
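The abstract names the standard image augmentations (rotation, shift, flip) and proposes using the applied transform itself as a pseudo label for self-supervised training. A minimal sketch of that idea, using NumPy array operations; the function name `augment_with_pseudo_label` and the specific transforms chosen are illustrative assumptions, not the paper's exact procedures:

```python
import numpy as np

def augment_with_pseudo_label(image, rng):
    """Apply one standard augmentation (rotation, shift, or flip) to a
    2-D image array and return the transformed image together with the
    index of the transform, which serves as a pseudo label.
    Illustrative sketch only; the paper's procedures may differ."""
    label = int(rng.integers(0, 3))
    if label == 0:
        out = np.rot90(image)                    # 90-degree rotation
    elif label == 1:
        out = np.roll(image, shift=1, axis=1)    # one-pixel horizontal shift
    else:
        out = np.fliplr(image)                   # horizontal flip
    return out, label

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)
aug, y = augment_with_pseudo_label(img, rng)
```

A classifier trained to predict `y` from `aug` never needs the original class labels, which is what makes the augmentation a self-supervision signal.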