TransCNN-HAE: Transformer-CNN Hybrid AutoEncoder for Blind Image Inpainting

Proceedings of the 30th ACM International Conference on Multimedia
Pub Date: 2022-10-10
DOI: 10.1145/3503161.3547848

Haoru Zhao, Zhaorui Gu, Bing Zheng, Haiyong Zheng


Citations: 3

Abstract

Blind image inpainting is extremely challenging because the contamination is unknown and its properties vary in complex ways across contaminated images. Current mainstream work decomposes blind image inpainting into two stages: estimating a mask from the contaminated image, then inpainting the image based on the estimated mask. This two-stage solution requires two CNN-based encoder-decoder architectures, one for estimation and one for inpainting. In this work, we propose a novel one-stage Transformer-CNN Hybrid AutoEncoder (TransCNN-HAE) for blind image inpainting. It follows an inpainting-then-reconstructing pipeline: the Transformer's global long-range contextual modeling repairs contaminated regions, and the CNN's local short-range contextual modeling reconstructs the repaired image. Moreover, a Cross-layer Dissimilarity Prompt (CDP) is devised to accelerate the identification and inpainting of contaminated regions. Ablation studies validate the efficacy of both TransCNN-HAE and CDP, and extensive experiments on various datasets with multi-property contaminations show that our method achieves state-of-the-art performance on blind image inpainting at much lower computational cost. Our code is available at https://github.com/zhenglab/TransCNN-HAE.
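The abstract names the Cross-layer Dissimilarity Prompt but does not define it. As a purely illustrative sketch, and not the authors' formulation, one way to read "cross-layer dissimilarity" is as a per-token score measuring how much a patch's feature vector changes between two encoder layers. The function names, the use of cosine dissimilarity, and the toy feature values below are all assumptions introduced for illustration; the actual method is in the linked repository.

```python
import math
from typing import List, Sequence


def cosine_dissimilarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cos(u, v): 0 for the same direction, 2 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; returning 1.0 is an assumption.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def cross_layer_dissimilarity(
    layer_a: Sequence[Sequence[float]],
    layer_b: Sequence[Sequence[float]],
) -> List[float]:
    """Hypothetical helper: score each token by how far its feature
    direction moved between two layers (one vector per image patch)."""
    if len(layer_a) != len(layer_b):
        raise ValueError("both layers must hold the same number of tokens")
    return [cosine_dissimilarity(a, b) for a, b in zip(layer_a, layer_b)]


# Toy features for three patch tokens at two layers (made-up values).
shallow = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
deep = [[2.0, 0.0], [0.0, 3.0], [1.0, -1.0]]
scores = cross_layer_dissimilarity(shallow, deep)
```

In this toy input the first two tokens keep their direction across layers and the third rotates by 90 degrees, so only the third receives a high score. Whether high-scoring tokens correspond to contaminated regions in the real model is not something the abstract states.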
