{"title":"Generative adversarial networks for generating RGB-D videos","authors":"Yuki Nakahira, K. Kawamoto","doi":"10.23919/APSIPA.2018.8659648","DOIUrl":null,"url":null,"abstract":"Generative adversarial networks(GANs) have been successfully applied for generating high quality natural images and have been extended to the generation of RGB videos and 3D volume data. In this paper we consider the task of generating RGB-D videos, which is less extensively studied and still challenging. We explore deep GAN architectures suitable for the task, and develop 4 GAN architectures based on existing video-based GANs. With a facial expression database, we experimentally find that an extended version of the motion and content decomposed GANs, known as MoCoGAN, provides the highest quality RGB-D videos. We discuss several applications of our GAN to content creation and data augmentation, and also discuss its potential applications in behavioral experiments.","PeriodicalId":287799,"journal":{"name":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"14 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/APSIPA.2018.8659648","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Generative adversarial networks (GANs) have been successfully applied to generating high-quality natural images and have been extended to the generation of RGB videos and 3D volume data. In this paper we consider the task of generating RGB-D videos, which is less extensively studied and still challenging. We explore deep GAN architectures suitable for the task and develop four GAN architectures based on existing video-based GANs. Using a facial expression database, we experimentally find that an extended version of the motion- and content-decomposed GAN, known as MoCoGAN, provides the highest-quality RGB-D videos. We discuss several applications of our GAN to content creation and data augmentation, and also discuss its potential applications in behavioral experiments.
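To make the motion/content decomposition concrete, the sketch below shows a minimal MoCoGAN-style generator extended to produce 4-channel (RGB + depth) frames: a content code is sampled once per video, a recurrent network turns per-frame noise into a motion trajectory, and a shared image generator renders each frame. This is an illustrative PyTorch sketch under assumed layer sizes and names (e.g. RGBDVideoGenerator, 64x64 output); it is not the authors' actual architecture.

```python
# Minimal MoCoGAN-style RGB-D generator sketch (illustrative; not the paper's exact model).
import torch
import torch.nn as nn

class RGBDVideoGenerator(nn.Module):
    def __init__(self, content_dim=50, motion_dim=10, hidden_dim=10):
        super().__init__()
        self.content_dim = content_dim
        self.motion_dim = motion_dim
        # Motion path: a GRU maps per-frame noise to a smooth motion trajectory.
        self.motion_rnn = nn.GRU(motion_dim, hidden_dim, batch_first=True)
        # Frame generator: maps [content ; motion] to a 64x64 frame with 4 channels (RGB + depth).
        self.frame_gen = nn.Sequential(
            nn.ConvTranspose2d(content_dim + hidden_dim, 256, 4, 1, 0),  # 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),                       # 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),                        # 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1),                         # 32x32
            nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 4, 4, 2, 1),                          # 64x64, 4 ch = RGB-D
            nn.Tanh(),
        )

    def forward(self, batch, frames):
        # Content code: sampled once per video and shared across all frames.
        z_c = torch.randn(batch, self.content_dim)
        # Motion codes: independent noise per frame, correlated over time by the GRU.
        z_m, _ = self.motion_rnn(torch.randn(batch, frames, self.motion_dim))
        z = torch.cat([z_c.unsqueeze(1).expand(-1, frames, -1), z_m], dim=2)
        out = self.frame_gen(z.reshape(batch * frames, -1, 1, 1))
        return out.view(batch, frames, 4, 64, 64)  # (B, T, RGB+D, H, W)

# Usage: generate 2 videos of 16 RGB-D frames each.
gen = RGBDVideoGenerator()
video = gen(batch=2, frames=16)  # torch.Size([2, 16, 4, 64, 64])
```

In a full MoCoGAN-style setup this generator would be trained against two discriminators, one judging individual frames and one judging short clips; those are omitted here for brevity.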