Color Image Steganalysis Based on Pixel Difference Convolution and Enhanced Transformer With Selective Pooling

IF 6.3 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS

IEEE Transactions on Information Forensics and Security Pub Date : 2024-10-24 DOI:10.1109/TIFS.2024.3486027

Kangkang Wei;Weiqi Luo;Jiwu Huang

IEEE Transactions on Information Forensics and Security, vol. 19, pp. 9970-9983. https://ieeexplore.ieee.org/document/10734380/

Citations: 0

Abstract

Current deep learning-based steganalyzers often depend on specific image dimensions, leading to inevitable adjustments in network structure when dealing with varied image sizes. This impedes their effectiveness in managing the wide range of image sizes commonly found on social media. To address this issue, our paper presents a novel steganalytic network that is optimized for fixed-size (notably, $256\times 256$) color images, but is capable of efficiently detecting stego images of arbitrary size without needing retraining or modifications to the network. Our proposed network is comprised of four modules. In the initial stem module, we calculate truncated residuals for each color channel of the input image. Diverging from existing steganalytic networks that rely on vanilla convolution, we have developed a pixel difference convolution module designed to better capture the artifacts introduced by steganography. Following this, we introduce an enhanced Transformer module with selective pooling, aimed at more effectively extracting global steganalytic features. To guarantee our network’s adaptability to different image sizes, we have developed a selective pooling strategy. This involves using global covariance pooling for fixed-size color images and spatial pyramid pooling for color images of various other sizes. This approach effectively standardizes the feature maps into uniform feature vectors. The final module is focused on classification. Extensive testing results on the ALASKA II color image dataset have demonstrated that our approach significantly improves detection performance for both fixed-size and arbitrary-size images, achieving state-of-the-art results. Additionally, we provide numerous ablation studies to confirm the effectiveness and soundness of our proposed network architecture.
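Two ideas in the abstract lend themselves to a small illustration: convolving pixel differences rather than raw intensities, and pooling a feature map of any size into a vector of fixed length. The sketch below is not the authors' implementation. The abstract does not say which difference pattern, kernel size, pyramid levels, or pooling operator the network uses, so the central-difference form, the 3x3 kernel, the levels `(1, 2, 4)`, and max pooling are all assumptions, and the function names are hypothetical. It is written in plain Python on a single channel for readability.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def central_pixel_difference_conv(image: Matrix, kernel: Matrix) -> Matrix:
    """3x3 'valid' convolution over differences from the centre pixel.

    Each output is sum_i w_i * (x_i - x_centre), so a constant patch maps
    to 0 whatever the kernel weights are. (Assumed variant, see lead-in.)
    """
    height, width = len(image), len(image[0])
    out: Matrix = []
    for r in range(1, height - 1):
        row: List[float] = []
        for c in range(1, width - 1):
            centre = image[r][c]
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    diff = image[r + dr][c + dc] - centre
                    acc += kernel[dr + 1][dc + 1] * diff
            row.append(acc)
        out.append(row)
    return out


def spatial_pyramid_pool(
    feature_map: Matrix, levels: Sequence[int] = (1, 2, 4)
) -> List[float]:
    """Max-pool a 2-D map over an n x n grid for every n in `levels`.

    The output length is sum(n * n for n in levels), independent of the
    height and width of `feature_map`.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled: List[float] = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceil for the end, so the bins cover
            # the whole map and no bin is ever empty.
            r0, r1 = (i * height) // n, -(-(i + 1) * height // n)
            for j in range(n):
                c0, c1 = (j * width) // n, -(-(j + 1) * width // n)
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1)
                    )
                )
    return pooled


if __name__ == "__main__":
    # Illustrative kernel and synthetic inputs only; not taken from the paper.
    kernel = [[0.5, -1.0, 0.5], [-1.0, 2.0, -1.0], [0.5, -1.0, 0.5]]
    for size in (8, 13):
        img = [[float((3 * r + 5 * c) % 7) for c in range(size)]
               for r in range(size)]
        features = central_pixel_difference_conv(img, kernel)
        print(size, len(spatial_pyramid_pool(features)))
```

With the assumed levels, both the 8x8 and the 13x13 input yield a vector of length 21 (1 + 4 + 16), which is the property that lets a classifier with a fixed input width accept images of different sizes. A real network would apply the same pooling per channel and concatenate the results.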


Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

14.40

Self-citation rate

7.40%

Publication count

234

Review time

6.5 months

About the journal: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance and systems applications that incorporate these features.