Trans-CycleGAN: Image-to-Image Style Transfer with Transformer-based Unsupervised GAN
Shiwen Li
Vision, vol. 178, no. 1, pp. 1–4. Journal article, published 2022-05-20. DOI: 10.1109/cvidliccea56201.2022.9824311
The field of computer image generation is developing rapidly, and more and more personalized image-to-image style transfer applications are being produced. Image translation converts data between two different styles to generate realistic pictures, which can not only meet the individual needs of users but also alleviate the problem of insufficient data for a particular style of picture. Transformers have long occupied an important position in the NLP field; in recent years, owing to their model interpretability and strong multimodal fusion ability, they have also performed well in computer vision. This paper studies the application of Transformers to image-to-image style transfer: the traditional CNN structures in the discriminator and generator of CycleGAN are replaced with an improved Transformer, and a comparative experiment is carried out against the traditional CycleGAN. Tests on the public Maps and CelebA datasets show results comparable to those of the traditional CycleGAN. This paper shows that the Transformer can perform image-to-image style transfer in an unsupervised GAN, which expands the application of Transformers in the CV field, and that it can serve as a general architecture for more vision tasks in the future.
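The two ingredients the abstract combines can be sketched compactly: a Transformer operates on an image by splitting it into patch tokens and applying self-attention (the piece that replaces CycleGAN's CNN layers), while unsupervised training rests on CycleGAN's cycle-consistency loss. The following is a minimal NumPy sketch of both ideas, not the paper's actual model; the patch size, projection dimensions, and identity stand-ins for the generators G and F are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(img, patch=8):
    """Split a square HxW image into flattened non-overlapping patch tokens."""
    h, w = img.shape
    grid = img.reshape(h // patch, patch, w // patch, patch)
    return grid.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product attention over patch tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # softmax numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def cycle_consistency_loss(real, reconstructed):
    """L1 loss between x and F(G(x)) -- the unsupervised CycleGAN training signal."""
    return np.mean(np.abs(real - reconstructed))

img = rng.random((64, 64))            # stand-in grayscale image
tokens = patchify(img)                # 64 tokens, each of dimension 64
d = tokens.shape[1]
wq, wk, wv = (rng.random((d, 32)) for _ in range(3))  # random projections
attended = self_attention(tokens, wq, wk, wv)         # shape (64, 32)

# Hypothetical identity mappings standing in for the generators G: X->Y, F: Y->X;
# with identities, reconstructing x through the cycle gives zero loss.
G = F = lambda x: x
loss = cycle_consistency_loss(img, F(G(img)))
```

In the actual Trans-CycleGAN, stacks of such attention blocks (with learned projections, feed-forward layers, and normalization) replace the convolutional generator and discriminator, while the adversarial and cycle-consistency losses drive training without paired examples.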