Local and Global GANs with Semantic-Aware Upsampling for Image Generation

IF 20.8 | Zone 1, Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Pattern Analysis and Machine Intelligence | Pub Date: 2022-02-28 | DOI: 10.48550/arXiv.2203.00047

Hao Tang, Ling Shao, Philip H. S. Torr, N. Sebe


Citations: 13

Abstract


Local and Global GANs with Semantic-Aware Upsampling for Image Generation

In this paper, we address the task of semantic-guided image generation. One challenge common to most existing image-level generation methods is difficulty in generating small objects and detailed local textures. To tackle this issue, in this work we consider generating images using local context. As such, we design a local class-specific generative network using semantic maps as guidance, which separately constructs and learns subgenerators for different classes, enabling it to capture finer details. To learn more discriminative class-specific feature representations for the local generation, we also propose a novel classification module. To combine the advantages of both global image-level and local class-specific generation, a joint generation network is designed with an attention fusion module and a dual-discriminator structure embedded. Lastly, we propose a novel semantic-aware upsampling method, which has a larger receptive field and can take far-away pixels that are semantically related for feature upsampling, enabling it to better preserve semantic consistency for instances with the same semantic labels. Extensive experiments on two image generation tasks show the superior performance of the proposed method. State-of-the-art results are established by large margins on both tasks and on nine challenging public benchmarks.
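The abstract does not give formulas, so the following is only a minimal sketch of one plausible reading of the local class-specific generation and attention fusion steps. It assumes that each class has its own sub-generator output, that the semantic map selects which sub-generator supplies each pixel, and that the global and local images are blended with per-pixel softmax weights. The function names (`class_masks`, `local_generation`, `attention_fusion`) and the fixed toy arrays standing in for learned network outputs are hypothetical and are not taken from the paper.

```python
import math


def class_masks(semantic_map, num_classes):
    """Build one binary mask per class from an integer label map (H x W lists)."""
    return [
        [[1.0 if label == c else 0.0 for label in row] for row in semantic_map]
        for c in range(num_classes)
    ]


def local_generation(semantic_map, class_outputs):
    """Compose a local image: each pixel comes from the sub-generator output
    whose class index matches the pixel's semantic label (assumed scheme)."""
    num_classes = len(class_outputs)
    masks = class_masks(semantic_map, num_classes)
    height, width = len(semantic_map), len(semantic_map[0])
    return [
        [
            sum(masks[c][i][j] * class_outputs[c][i][j] for c in range(num_classes))
            for j in range(width)
        ]
        for i in range(height)
    ]


def attention_fusion(global_img, local_img, global_logits, local_logits):
    """Blend global and local images with per-pixel softmax weights
    computed over the two attention logits (assumed fusion rule)."""
    fused = []
    for i in range(len(global_img)):
        row = []
        for j in range(len(global_img[0])):
            e_global = math.exp(global_logits[i][j])
            e_local = math.exp(local_logits[i][j])
            w_global = e_global / (e_global + e_local)
            row.append(w_global * global_img[i][j] + (1.0 - w_global) * local_img[i][j])
        fused.append(row)
    return fused


# Toy 2x2 example: top row is class 0, bottom row is class 1.
semantic_map = [[0, 0], [1, 1]]
class_outputs = [
    [[0.2, 0.2], [0.2, 0.2]],  # stand-in for the class-0 sub-generator output
    [[0.8, 0.8], [0.8, 0.8]],  # stand-in for the class-1 sub-generator output
]
global_img = [[0.5, 0.5], [0.5, 0.5]]  # stand-in for the global generator output
zero_logits = [[0.0, 0.0], [0.0, 0.0]]  # equal logits give a 50/50 blend

local = local_generation(semantic_map, class_outputs)
fused = attention_fusion(global_img, local, zero_logits, zero_logits)
print(local)
print(fused)
```

In this toy setup the local image simply copies each class's values into its own region, and equal attention logits average the global and local images. In a trained model the sub-generator outputs and attention logits would be produced by networks rather than fixed by hand.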


Source journal

IEEE Transactions on Pattern Analysis and Machine Intelligence (Engineering & Technology / Engineering: Electrical & Electronic)

CiteScore

28.40

Self-citation rate

3.00%

Articles published

885

Review time

8.5 months

Journal description: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and/or software architectures are also covered.