Attribute-Based Facial Image Manipulation on Latent Space

2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)
Pub Date: 2021-11-16
DOI: 10.1109/AVSS52988.2021.9663845

Chien-Hung Lin, Yiyun Pan, Ja-Ling Wu


Citations: 1

Abstract



Image generation with machine learning has matured, particularly with Generative Adversarial Networks (GANs). However, the complicated architecture of these models makes it difficult to ensure that output images are diverse and controllable without complicating the implementation. Some researchers therefore edit the latent codes produced by a given model directly in the latent space, then manipulate the output image by feeding the new latent codes into the original model, leaving its structure and learned parameters unchanged. These methods, however, either require that the latent space not be too large or suffer from feature entanglement. In this work, we propose compressing the original latent space to address these problems, which improves the applicability and usability of methods that are limited by latent-space size. Compared with existing methods, this approach can be applied to more models while still achieving the intended image manipulation.
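The general idea of editing a latent code in a compressed space can be illustrated with a small sketch. The abstract does not say how the latent space is compressed or how attribute directions are found, so everything below is an assumption for illustration: PCA stands in for the compression step, the attribute direction is taken as the difference of class means, and the latent codes and labels are synthetic stand-ins for codes sampled from a pretrained generator. It is not the authors' method.

```python
import numpy as np

def fit_pca(latents, k):
    """Return (mean, components) of a k-dimensional PCA (assumed compression step)."""
    mean = latents.mean(axis=0)
    # SVD of the centered codes; rows of vt are orthonormal principal directions.
    _, _, vt = np.linalg.svd(latents - mean, full_matrices=False)
    return mean, vt[:k]

def compress(z, mean, components):
    """Project latent code(s) into the compressed space."""
    return (z - mean) @ components.T

def attribute_direction(codes, labels):
    """Unit vector from the mean of label-0 codes to the mean of label-1 codes."""
    d = codes[labels == 1].mean(axis=0) - codes[labels == 0].mean(axis=0)
    return d / np.linalg.norm(d)

def edit_latent(z, components, direction, alpha):
    """Shift z by alpha along a compressed-space direction lifted back to the full space."""
    return z + alpha * (direction @ components)

# Synthetic stand-in data: 64-dimensional codes with 8 underlying factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(1000, 8))
mixing = rng.normal(size=(8, 64))
latents = factors @ mixing + 0.01 * rng.normal(size=(1000, 64))
labels = (factors[:, 0] > 0).astype(int)  # hypothetical binary attribute

mean, components = fit_pca(latents, k=8)
codes = compress(latents, mean, components)
direction = attribute_direction(codes, labels)

alpha = 3.0
z = latents[0]
z_edit = edit_latent(z, components, direction, alpha)
# z_edit would then be fed to the unchanged generator, e.g. image = G(z_edit).
```

Because the edit is added to the original code rather than replacing it with its reconstruction, any part of the code that the compressed space does not capture is left untouched, and the generator's structure and parameters stay unchanged.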

