Recurrent Network with Enhanced Alignment and Attention-Guided Aggregation for Compressed Video Quality Enhancement

Xiaodi Shi, Jucai Lin, Dong-Jin Jiang, Chunmei Nian, Jun Yin

2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), published 2022-12-13. DOI: 10.1109/VCIP56404.2022.10008807

Abstract: Recently, various compressed video quality enhancement techniques have been proposed to suppress visual artifacts. Most existing methods rely on optical-flow-based or deformable alignment to exploit spatiotemporal information across frames. However, inaccurate motion estimation and the training instability of deformable convolution can degrade reconstruction performance. In this paper, we design a bi-directional recurrent network equipped with enhanced deformable alignment and attention-guided aggregation to promote information flow among frames. For alignment, a pair of scale and shift parameters is learned to modulate optical flows into new offsets for deformable convolution. Furthermore, a preference-oriented attention aggregation strategy is designed for temporal information fusion; it synthesizes global information from the inputs to modulate features for effective fusion. Extensive experiments show that the proposed method achieves strong quantitative and qualitative performance.
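The alignment idea in the abstract — learning a scale and a shift that modulate optical-flow vectors into deformable-convolution offsets — can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation; the function name `modulate_offsets` and the list-of-tuples data layout are hypothetical, and a real network would learn `scale` and `shift` per position and feed the result to a deformable convolution.

```python
def modulate_offsets(flow, scale, shift):
    """Modulate optical-flow vectors into deformable-conv sampling offsets.

    flow, scale, shift: lists of (dx, dy) pairs, one per spatial position.
    Each output offset is computed elementwise as: offset = scale * flow + shift.
    (Hypothetical layout; in practice these would be dense tensors.)
    """
    offsets = []
    for (fx, fy), (sx, sy), (bx, by) in zip(flow, scale, shift):
        offsets.append((sx * fx + bx, sy * fy + by))
    return offsets

# With scale = 1 and shift = 0 the raw flow is recovered unchanged,
# so the learned parameters act as a correction on top of motion estimation.
flow = [(1.0, -0.5), (0.2, 0.0)]
scale = [(1.0, 1.0), (2.0, 2.0)]
shift = [(0.0, 0.0), (0.1, -0.1)]
print(modulate_offsets(flow, scale, shift))
```

The design intent suggested by the abstract is that a slightly wrong flow vector can still yield a usable deformable-conv offset after the learned scale-and-shift correction.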