Semantic Super-Resolution Using a Transformer Model
Donghyun Ku, Hanhoon Park
Journal of Korea Multimedia Society, published 2023-10-31
https://doi.org/10.9717/kmms.2023.26.10.1251
This paper proposes a method for improving the performance of SwinIR, a vision Transformer-based super-resolution network, by introducing a Transformer decoder with learnable category queries. The decoder extracts semantic information from datasets belonging to different categories (e.g., text and faces), and this information improves category-specific texture reconstruction during super-resolution. Experiments with decoders of different architectures were conducted to analyze the performance of the proposed method. The results confirm that the decoder can improve the quality of the super-resolved images produced by SwinIR both qualitatively and quantitatively, although the improvement varies with the depth of the decoder and with how the semantic information is applied.
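
The abstract does not give the decoder's details, so the following is only a minimal sketch of the general idea of category queries attending to image features, not the authors' implementation. It assumes a single attention head with no projection matrices, random values standing in for learned queries and for backbone features, and a simple additive fusion; the names `cross_attention`, `inject_semantics`, and `category_queries` are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, tokens):
    """Single-head cross-attention of one query vector over feature tokens.

    Assumption: query/key/value projections are omitted (identity) for brevity.
    """
    dim = len(query)
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(dim)
              for tok in tokens]
    weights = softmax(scores)
    # Weighted average of the tokens: the "semantic" vector for this query.
    return [sum(w * tok[d] for w, tok in zip(weights, tokens))
            for d in range(dim)]


def inject_semantics(tokens, category_queries, category):
    """Hypothetical fusion step: add the semantic vector to every token."""
    semantic = cross_attention(category_queries[category], tokens)
    return [[t + s for t, s in zip(tok, semantic)] for tok in tokens]


random.seed(0)
dim = 4
# One query per category; random here, learned during training in a real model.
category_queries = {name: [random.gauss(0, 1) for _ in range(dim)]
                    for name in ("text", "face")}
# Stand-in for flattened backbone features (6 tokens of width 4).
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(6)]

fused = inject_semantics(tokens, category_queries, "text")
print(len(fused), len(fused[0]))  # prints "6 4": the feature shape is unchanged
```

Additive fusion is just one way the semantic vector could be applied. The abstract notes that results depend on how the semantic information is used and on the decoder depth, so a real model would likely stack several such layers and compare fusion strategies.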