{"title":"WFormer: A Transformer-Based Soft Fusion Model for Robust Image Watermarking","authors":"Ting Luo;Jun Wu;Zhouyan He;Haiyong Xu;Gangyi Jiang;Chin-Chen Chang","doi":"10.1109/TETCI.2024.3386916","DOIUrl":null,"url":null,"abstract":"Most deep neural network (DNN) based image watermarking models often employ the encoder-noise-decoder structure, in which watermark is simply duplicated for expansion and then directly fused with image features to produce the encoded image. However, simple duplication will generate watermark over-redundancies, and the communication between the cover image and watermark in different domains is lacking in image feature extraction and direction fusion, which degrades the watermarking performance. To solve those drawbacks, this paper proposes a Transformer-based soft fusion model for robust image watermarking, namely WFormer. Specifically, to expand watermark effectively, a watermark preprocess module (WPM) is designed with Transformers to extract valid and expanded watermark features by computing its self-attention. Then, to replace direct fusion, a soft fusion module (SFM) is deployed to integrate Transformers into image fusion with watermark by mining their long-range correlations. Precisely, self-attention is computed to extract their own latent features, and meanwhile, cross-attention is learned for bridging their gap to embed watermark effectively. In addition, a feature enhancement module (FEM) builds communication between the cover image and watermark by capturing their cross-feature dependencies, which tunes image features in accordance with watermark features for better fusion. Experimental results show that the proposed WFormer outperforms the existing state-of-the-art watermarking models in terms of invisibility, robustness, and embedding capacity. Furthermore, ablation results prove the effectiveness of the WPM, the FEM, and the SFM.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":"8 6","pages":"4179-4196"},"PeriodicalIF":5.3000,"publicationDate":"2024-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10505734/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Most deep neural network (DNN) based image watermarking models employ the encoder-noise-decoder structure, in which the watermark is simply duplicated for expansion and then directly fused with image features to produce the encoded image. However, simple duplication generates watermark over-redundancy, and communication between the cover image and the watermark, which lie in different domains, is lacking during image feature extraction and direct fusion, which degrades watermarking performance. To address these drawbacks, this paper proposes a Transformer-based soft fusion model for robust image watermarking, namely WFormer. Specifically, to expand the watermark effectively, a watermark preprocess module (WPM) is designed with Transformers to extract valid, expanded watermark features by computing self-attention. Then, to replace direct fusion, a soft fusion module (SFM) is deployed, which integrates Transformers into the fusion of the image and the watermark by mining their long-range correlations. More precisely, self-attention is computed to extract the latent features of each, while cross-attention is learned to bridge the gap between them so that the watermark is embedded effectively. In addition, a feature enhancement module (FEM) builds communication between the cover image and the watermark by capturing their cross-feature dependencies, tuning image features in accordance with watermark features for better fusion. Experimental results show that the proposed WFormer outperforms existing state-of-the-art watermarking models in terms of invisibility, robustness, and embedding capacity. Furthermore, ablation results prove the effectiveness of the WPM, the FEM, and the SFM.
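To make the described pipeline concrete, below is a minimal PyTorch sketch of the two central ideas in the abstract: expanding the watermark with self-attention rather than duplication (WPM-like), and fusing image and watermark tokens with cross-attention rather than direct concatenation (SFM-like). All module names, dimensions, and wiring here are illustrative assumptions, not the authors' actual WFormer implementation.

```python
# Hedged sketch of WPM-style watermark expansion and SFM-style soft fusion.
# Assumed, illustrative design; not the published WFormer code.
import torch
import torch.nn as nn


class WatermarkPreprocess(nn.Module):
    """Expand a short binary watermark into token features via self-attention (WPM-like)."""
    def __init__(self, wm_len=30, num_tokens=64, dim=128, heads=4):
        super().__init__()
        # Learned expansion instead of simple duplication of the watermark bits.
        self.expand = nn.Linear(wm_len, num_tokens * dim)
        self.num_tokens, self.dim = num_tokens, dim
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, wm):                                  # wm: (B, wm_len) in {0, 1}
        tok = self.expand(wm).view(-1, self.num_tokens, self.dim)
        attn, _ = self.self_attn(tok, tok, tok)             # self-attention over watermark tokens
        return self.norm(tok + attn)                        # (B, num_tokens, dim)


class SoftFusion(nn.Module):
    """Fuse image tokens with watermark tokens via cross-attention (SFM-like)."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tok, wm_tok):                     # (B, N_img, dim), (B, N_wm, dim)
        # Image queries attend to watermark keys/values, mining long-range correlations.
        attn, _ = self.cross_attn(img_tok, wm_tok, wm_tok)
        return self.norm(img_tok + attn)                    # fused (watermarked) image tokens


if __name__ == "__main__":
    wm = torch.randint(0, 2, (2, 30)).float()               # 30-bit watermark, batch of 2
    img_tok = torch.randn(2, 256, 128)                      # e.g., 16x16 patch tokens of dim 128
    fused = SoftFusion()(img_tok, WatermarkPreprocess()(wm))
    print(fused.shape)                                      # torch.Size([2, 256, 128])
```

In this sketch, the fused tokens would then be decoded back into the encoded image; the FEM described in the abstract, which tunes image features against watermark features before fusion, is omitted for brevity.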
Journal introduction:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication and publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.