Contrastive Learning with Attention Mechanism and Multi-Scale Sample Network for Unpaired Image-to-Image Translation
Yunhao Liu, Songyi Zhong, Zhenglin Li, Yangqiaoyu Zhou
2023 IEEE Symposium on Computers and Communications (ISCC), 2023-07-09
DOI: 10.1109/ISCC58397.2023.10218053
The aim of unpaired image translation is to learn how to transform images from a source to a target domain, while preserving as many domain-invariant features as possible. Previous methods have not been able to separate foreground and background well, resulting in texture being added to the background. Moreover, these methods often fail to distinguish different objects or different parts of the same object. In this paper, we propose an attention-based generator (AG) that can redistribute the weights of visual features, significantly enhancing the network's performance in separating foreground and background. We also embed a multi-scale multilayer perceptron (MSMLP) into the framework to capture features across a broader range of scales, which improves the discrimination of various parts of objects. Our method outperforms existing methods on various datasets in terms of Fréchet inception distance. We further analyze the impact of different modules in our approach through subsequent ablation experiments.
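For readers unfamiliar with the evaluation metric, the Fréchet inception distance (FID) is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images: ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)). The following is a minimal sketch of that formula only, assuming the feature vectors have already been extracted (the Inception step is omitted). The function name `frechet_distance` and the array layout are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    Hypothetical helper; in FID the features would come from an
    Inception network, which is not included here.
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    # Sample covariance of each feature set (columns are feature dimensions).
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 4))
    fake = real + 3.0  # same covariance, mean shifted by 3 in each dimension
    print(frechet_distance(real, real))
    print(frechet_distance(real, fake))
```

A lower value indicates that the two feature distributions are closer. When the covariances match, the distance reduces to the squared difference between the means.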