Controllable image generation and manipulation

I. Patras
Proceedings of the 2nd ACM International Workshop on Multimedia AI against Disinformation, 2023-06-12. DOI: https://doi.org/10.1145/3592572.3596476

Abstract

Recent years have seen unprecedented interest in Deep Learning methods for generating images and image sequences that the human eye can hardly distinguish from real ones. A major open issue in this field is how the generation process can be easily controlled. In this talk we will focus on some of our recent work on generative models aimed primarily at controllable generation. We will first present unsupervised methods for learning non-linear paths in the latent spaces of Generative Adversarial Networks (GANs), such that following different paths leads to different types of changes in the resulting images (e.g., removing the background, or changing head pose or facial expression) [4]. Subsequently, we will present a method that allows local editing by finding a Parts and Appearances decomposition in the GAN latent space [2]. Then, we will present recent work on reenactment [1], where the goal is to transfer the facial activity (pose, expressions, speech) of one person to another, and recent work in which supervision for generation comes from language models [3]. Finally, we will touch on the technical challenges ahead, as well as on the challenges that this technology creates with respect to the spread of misinformation.