{"title":"Generation With Nuanced Changes: Continuous Image-to-Image Translation With Adversarial Preferences","authors":"Yinghua Yao;Yuangang Pan;Ivor W. Tsang;Xin Yao","doi":"10.1109/TAI.2024.3497915","DOIUrl":null,"url":null,"abstract":"Most previous methods for continuous image-to-image translation resorted to binary attributes with restrictive description ability and thus cannot achieve satisfactory performance. Some works proposed to use fine-grained semantic information, <italic>relative attributes (RAs), preferences over pairs of images on the strength of a specified attribute</i>. However, they still failed to reconcile both goals for smooth translation and for high-quality generation simultaneously. In this work, we propose a new model continuous translation via adversarial preferences (CTAP) to coordinate these two goals for high-quality continuous translation based on RAs. In CTAP, we simultaneously train two modules: a generator that translates an input image to the desired image with smooth nuanced changes w.r.t. the interested attributes; and a ranker that executes adversarial preferences consisting of the input image and the desired image. Particularly, adversarial preferences involve an adversarial ranking process: 1) the ranker thinks no difference between the desired image and the input image in terms of the interested attributes; 2) the generator fools the ranker to believe the attributes of its output image changes as expect compared with the input image. RAs over pairs of real images are introduced to guide the ranker to rank image pairs regarding the interested attributes only. With an effective ranker, the generator would “win” the adversarial game by producing high-quality images that present smooth changes. 
The experiments on two face datasets and one shoe dataset demonstrate that our CTAP achieves state-of-art results in generating high-fidelity images which exhibit smooth changes over the interested attributes.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 4","pages":"816-828"},"PeriodicalIF":0.0000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10752923/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Most previous methods for continuous image-to-image translation resorted to binary attributes, whose limited descriptive ability prevents satisfactory performance. Some works proposed to use fine-grained semantic information in the form of relative attributes (RAs), i.e., preferences over pairs of images with respect to the strength of a specified attribute. However, they still failed to reconcile the two goals of smooth translation and high-quality generation. In this work, we propose a new model, continuous translation via adversarial preferences (CTAP), to coordinate these two goals for high-quality continuous translation based on RAs. In CTAP, we simultaneously train two modules: a generator that translates an input image to the desired image with smooth nuanced changes w.r.t. the attributes of interest; and a ranker that processes adversarial preferences formed by the input image and the desired image. In particular, adversarial preferences involve an adversarial ranking process: 1) the ranker perceives no difference between the desired image and the input image in terms of the attributes of interest; 2) the generator fools the ranker into believing that the attributes of its output image change as expected relative to the input image. RAs over pairs of real images are introduced to guide the ranker to rank image pairs with respect to the attributes of interest only. With an effective ranker, the generator "wins" the adversarial game by producing high-quality images that present smooth changes. Experiments on two face datasets and one shoe dataset demonstrate that CTAP achieves state-of-the-art results in generating high-fidelity images that exhibit smooth changes over the attributes of interest.
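The adversarial ranking process described in the abstract can be sketched with a pairwise ranking loss. This is a minimal illustrative sketch, not the paper's actual formulation: the logistic loss form, the squared "no-difference" penalty, and the scalar scores below are all assumptions introduced for illustration.

```python
import math

def ranking_loss(score_a, score_b, pref):
    """Pairwise ranking loss on scalar attribute scores.

    pref = +1 if image a should rank above image b on the attribute,
           -1 if b should rank above a,
            0 if the pair should show no difference.
    """
    diff = score_a - score_b
    if pref == 0:
        # Hypothetical "no difference" penalty: push the scores together.
        return diff ** 2
    # Logistic pairwise ranking loss on the preferred ordering.
    return math.log1p(math.exp(-pref * diff))

# Ranker objective (sketch): learn real RAs, and see *no* attribute
# difference between an input image and the generator's output.
L_rank = ranking_loss(1.2, 0.4, +1) + ranking_loss(0.9, 0.7, 0)

# Generator objective (sketch): fool the ranker into believing the
# intended attribute change (pref = +1) occurred in its output.
L_gen = ranking_loss(0.9, 0.7, +1)
```

The opposing targets on the same generated pair (pref = 0 for the ranker vs. the intended preference for the generator) are what make the ranking process adversarial.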