Style-transfer GANs for bridging the domain gap in synthetic pose estimator training
Pavel Rojtberg, Thomas Pollabauer, Arjan Kuijper
2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
Published: 2020-04-28
DOI: 10.1109/AIVR50618.2020.00039 (https://doi.org/10.1109/AIVR50618.2020.00039)
Citations: 14
Abstract
Given the dependency of current CNN architectures on a large training set, the possibility of using synthetic data is alluring, as it allows a virtually infinite amount of labeled training data to be generated. However, producing such data is a nontrivial task, as current CNN architectures are sensitive to the domain gap between real and synthetic data. We propose to adopt general-purpose GAN models for pixel-level image translation, which allows the domain gap itself to be formulated as a learning problem. The obtained models are then used either during training or during inference to bridge the domain gap. Here, we focus on training the single-stage YOLO6D [20] object pose estimator on synthetic CAD geometry only, where not even approximate surface information is available. When employing paired GAN models, we use an edge-based intermediate domain and introduce different mappings to represent the unknown surface properties. Our evaluation shows a considerable improvement in model performance compared to a model trained with the same degree of domain randomization, while requiring very little additional effort.
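The edge-based intermediate domain mentioned in the abstract can be pictured with a small example. This is a minimal sketch only: the abstract does not say which edge detector is used, so the Sobel operator, the function name `sobel_edges`, and the `threshold` parameter are assumptions made for illustration, not the authors' method. The idea it shows is that a rendered CAD image and a real photograph can both be reduced to an edge map, which removes most of the surface appearance that the synthetic data lacks.

```python
import numpy as np


def sobel_edges(gray: np.ndarray, threshold: float = 0.25) -> np.ndarray:
    """Return a binary edge map (uint8, values 0 or 1) of a 2-D grayscale image.

    Hypothetical helper: Sobel gradient magnitude, normalized by its maximum
    and thresholded. The detector and the threshold are illustrative choices.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    h, w = gray.shape
    # Replicate border pixels so the output has the same size as the input.
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 filter response as nine shifted, weighted copies.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak > 0:
        magnitude = magnitude / peak
    return (magnitude >= threshold).astype(np.uint8)


# Toy input: a dark left half and a bright right half (one vertical edge).
toy = np.zeros((8, 8))
toy[:, 4:] = 1.0
edges = sobel_edges(toy)
print(edges[0])
```

In a paired setup of the kind the abstract describes, such edge maps would serve as the input side of the image pairs, with the translation model learning to produce the target appearance. How the unknown surface properties are mapped is not specified in the abstract and is not modeled here.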