Journal of Visual Communication and Image Representation: Latest Articles

A novel high-fidelity reversible data hiding scheme based on multi-classification pixel value ordering
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-05-10 · DOI: 10.1016/j.jvcir.2025.104473 · Volume 110, Article 104473
Chen Cui, Li Li, Jianfeng Lu, Shanqing Zhang, Chin-Chen Chang
Abstract: Pixel value ordering (PVO) is a highly effective technique that employs pixel-block partitioning and sorting for reversible data hiding (RDH). However, its embedding performance is significantly affected by block size. To address this, an improved pixel-based PVO (IPPVO) was developed, adopting a per-pixel approach and an adaptive context size. Nevertheless, IPPVO only considers pixels below and to the right for prediction, neglecting other, closer neighboring regions and leading to inaccurate predictions. This study presents an RDH strategy that uses multi-classification embedding to enhance performance. First, pixels are categorized into four classes based on the parity of their coordinates, and higher-correlation prediction values are obtained using an adaptive nearest-neighbor context size. Second, a new complexity calculation method based on the complexity frequency of pixel regions is introduced to better differentiate between complex and flat regions. Finally, an effective embedding ratio and an index-value constraint are introduced to mitigate excessive distortion when embedding large capacities. Experimental results indicate that the proposed scheme offers superior embedding capacity with low distortion compared to state-of-the-art PVO-based RDH methods. (An illustrative sketch of the basic PVO idea follows this entry.)
Citations: 0
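The core PVO mechanism that this paper builds on can be shown with a minimal sketch: the largest pixel in a sorted block is predicted by the second largest, and a prediction error of 1 carries one data bit. This is the standard PVO formulation, not the multi-classification scheme proposed here; the function name and example block are purely illustrative.

```python
import numpy as np

def pvo_embed_max(block, bit):
    """Minimal sketch of classic PVO embedding on the largest pixel of a block.

    The largest pixel is predicted by the second largest; a prediction error of 1
    carries one data bit, and larger errors are shifted by 1 to keep reversibility.
    """
    flat = block.flatten()
    order = np.argsort(flat, kind="stable")      # sort pixel values, remember positions
    max_pos, second_pos = order[-1], order[-2]
    error = int(flat[max_pos]) - int(flat[second_pos])
    if error == 1:                               # expandable: embed one bit
        flat[max_pos] += bit
    elif error > 1:                              # not expandable: shift to avoid ambiguity
        flat[max_pos] += 1
    return flat.reshape(block.shape)

stego = pvo_embed_max(np.array([[162, 160], [161, 163]], dtype=np.int64), bit=1)
print(stego)   # the largest pixel 163 becomes 164, carrying the bit
```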
GDPS: A general distillation architecture for end-to-end person search
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-05-06 · DOI: 10.1016/j.jvcir.2025.104468 · Volume 110, Article 104468
Shichang Fu, Tao Lu, Jiaming Wang, Yu Gu, Jiayi Cai, Kui Jiang
Abstract: Existing knowledge distillation methods for person search handle detection and re-identification (re-id) separately, which may lead to feature conflicts between the two subtasks. On the one hand, distilling only the detection task makes the network focus more on features common to all pedestrians, which may hurt re-id performance. On the other hand, distilling only the re-id task makes the network focus on person-specific characteristics, which may harm detection performance. To solve this problem, we propose a novel distillation method for person search that treats person search as a single task and distills the different subtasks within a unified framework, called General Distillation for Person Search (GDPS). Specifically, we optimize the general features shared by detection and re-id through feature-based knowledge distillation, aiming for accurate localization of individuals. In addition, we focus on the re-id task and perform relation-based and response-based knowledge distillation to obtain more discriminative person features. Finally, we integrate feature-based, relation-based, and response-based knowledge into a general framework that distills the two subtasks simultaneously and can be readily applied to various end-to-end person search methods. Extensive experiments demonstrate the effectiveness of GDPS across different one-step person search methods. Specifically, AlignPS with ResNet-50 and GDPS achieves 94.1% mAP on the CUHK-SYSU dataset, surpassing the 93.1% baseline by 1.0 percentage points and even exceeding the ResNet-50-DCN-based teacher model at 94.0% mAP. (A generic sketch of combined distillation losses follows this entry.)
Citations: 0
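GDPS integrates feature-based, relation-based, and response-based distillation into one framework. The PyTorch sketch below shows one generic way such terms are commonly combined; the loss formulations, equal weighting, temperature, and tensor names are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def distillation_loss(s_feat, t_feat, s_emb, t_emb, s_logit, t_logit, T=4.0):
    """Generic combination of feature-, relation-, and response-based KD terms."""
    # Feature-based: match intermediate feature maps (assumes identical shapes).
    l_feat = F.mse_loss(s_feat, t_feat)
    # Relation-based: match the pairwise similarity structure of person embeddings.
    s_rel = F.normalize(s_emb, dim=1) @ F.normalize(s_emb, dim=1).t()
    t_rel = F.normalize(t_emb, dim=1) @ F.normalize(t_emb, dim=1).t()
    l_rel = F.mse_loss(s_rel, t_rel)
    # Response-based: soften logits with temperature T and match the distributions.
    l_resp = F.kl_div(F.log_softmax(s_logit / T, dim=1),
                      F.softmax(t_logit / T, dim=1),
                      reduction="batchmean") * T * T
    return l_feat + l_rel + l_resp   # illustrative equal weighting of the three terms
```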
Low-complexity AV1 intra prediction algorithm
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-05-05 · DOI: 10.1016/j.jvcir.2025.104464 · Volume 110, Article 104464
Wanwei Huang, Xuan Xie, Yu Chen, Baotu Wang, Jian Chen, Pingping Chen
Abstract: As a new-generation video coding standard, Alliance for Open Media Video 1 (AV1) introduces flexible and diverse block partition types to improve coding efficiency, but this also increases coding complexity. To address this issue, we propose a low-complexity AV1 intra prediction algorithm that uses Long-edge Sparse Sampling (LSS) and Chroma Migrating from Luma (CML) to encode video sequences efficiently. First, we develop an LSS method that selects key reference pixels based on block partition conditions to reduce computational complexity. Second, we develop a CML algorithm that combines the angle mode of the luma component with the spatial correlations of the chroma components to derive more accurate linear model parameters between luma and chroma. Experimental results show that LSS avoids division operations and reduces addition operations by 93%. Combined with CML, our approach saves 4.97% of coding time and improves coding performance compared with standard AV1, particularly the quality of the chroma components. (A least-squares sketch of a luma-to-chroma linear model follows this entry.)
Citations: 0
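Chroma-from-luma style prediction fits a linear model chroma ≈ alpha · luma + beta from neighboring reconstructed samples. The sketch below is a plain least-squares fit of that model; it illustrates the general idea only and is not the CML parameter derivation described in the paper.

```python
import numpy as np

def fit_luma_to_chroma(luma_neighbors, chroma_neighbors):
    """Least-squares fit of chroma = alpha * luma + beta from reconstructed neighbors."""
    L = np.asarray(luma_neighbors, dtype=np.float64)
    C = np.asarray(chroma_neighbors, dtype=np.float64)
    var = L.var()
    alpha = 0.0 if var == 0 else ((L - L.mean()) * (C - C.mean())).mean() / var
    beta = C.mean() - alpha * L.mean()
    return alpha, beta

alpha, beta = fit_luma_to_chroma([100, 120, 140, 160], [60, 70, 80, 90])
print(alpha, beta)   # 0.5, 10.0: chroma is predicted as 0.5 * luma + 10
```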
Contrastive Deep Supervision Meets Self-Knowledge Distillation
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-05-03 · DOI: 10.1016/j.jvcir.2025.104470 · Volume 110, Article 104470
Weiwei Zhang, Peng Liang, Jianqing Zhu, Junhuang Wang
Abstract: Self-knowledge distillation (Self-KD) creates teacher-student pairs within a single network to enhance performance. However, existing Self-KD methods focus solely on task-related knowledge, neglecting the task-unrelated knowledge that is crucial for the intermediate layers' learning. To address this, we propose Contrastive Deep Supervision Meets Self-Knowledge Distillation (CDSKD), a technique that lets the network learn task-unrelated knowledge to aid training. CDSKD first incorporates an auxiliary classifier into the neural network for Self-KD. An attention module is then introduced before the auxiliary classifier's feature extractor to strengthen the original features, facilitating extraction and classification. A projection head follows the extractor, and the auxiliary classifier is trained with a contrastive loss to acquire task-unrelated knowledge, i.e., invariance to diverse data augmentations, thereby boosting the network's overall performance. Extensive experimental results on six datasets and eight networks show that CDSKD outperforms other deep supervision and Self-KD methods. (A standard two-view contrastive loss sketch follows this entry.)
Citations: 0
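The task-unrelated objective here is augmentation invariance learned through a contrastive loss on projected features. The sketch below is a standard NT-Xent loss over two augmented views of the same batch; it stands in for the contrastive term only, and the temperature and naming are illustrative assumptions rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss over two augmented views of the same N images."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # 2N x d, unit-norm embeddings
    sim = z @ z.t() / tau                                        # scaled cosine similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])  # positive = other view
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```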
MCANet: Feature pyramid network with multi-scale convolutional attention and aggregation mechanisms for semantic segmentation
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-05-02 · DOI: 10.1016/j.jvcir.2025.104466 · Volume 110, Article 104466
Shuo Hu, Xingwang Tao, Xingmiao Zhao
Abstract: The Feature Pyramid Network (FPN) is an important structure for feature fusion in semantic segmentation networks. However, most current FPN-based methods capture cross-scale long-range information insufficiently and exhibit aliasing effects during cross-scale fusion. In this paper, we propose the Multi-Scale Convolutional Attention and Aggregation Mechanisms Feature Pyramid Network (MAFPN). We first construct a Context Information Enhancement Module, which provides multi-scale global feature information to different levels through an adaptive-aggregation Multi-Scale Convolutional Attention Module (AMSCAM). This alleviates the shortage of cross-scale semantic information caused by top-down feature fusion. Furthermore, we propose a feature aggregation mechanism that promotes semantic alignment through a Lightweight Convolutional Attention Module (LFAM), enhancing the overall effectiveness of information fusion. Finally, we employ a lightweight self-attention mechanism to capture global long-range dependencies. MCANet is a Transformer-based encoder-decoder architecture whose encoder adopts UniFormer and BiFormer in separate configurations and whose decoder consists of MAFPN and FPN heads. With BiFormer as the encoder, MCANet achieves 49.98% mIoU on the ADE20K dataset and 80.95% and 80.45% mIoU on the Cityscapes validation and test sets, respectively. With UniFormer as the encoder, it attains 48.69% mIoU on ADE20K. (A generic multi-scale convolutional attention sketch follows this entry.)
Citations: 0
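As a rough illustration of what a multi-scale convolutional attention block can look like, the sketch below sums depthwise convolutions at several kernel sizes and uses the result to reweight the input features. The structure, kernel sizes, and sigmoid gating are assumptions for illustration; they are not the AMSCAM design from the paper.

```python
import torch
import torch.nn as nn

class MultiScaleConvAttention(nn.Module):
    """Illustrative multi-scale convolutional attention: depthwise convolutions at
    several kernel sizes are summed and used as an attention map over the input."""
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        attn = sum(branch(x) for branch in self.branches)   # aggregate multi-scale context
        return self.proj(attn).sigmoid() * x                 # reweight the input features

x = torch.randn(1, 64, 32, 32)
print(MultiScaleConvAttention(64)(x).shape)                   # torch.Size([1, 64, 32, 32])
```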
Deep-learning-based ConvLSTM and LRCN networks for human activity recognition
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-04-28 · DOI: 10.1016/j.jvcir.2025.104469 · Volume 110, Article 104469
Muhammad Hassan Khan, Muhammad Ahtisham Javed, Muhammad Shahid Farid
Abstract: Human activity recognition (HAR) has received significant research attention lately due to its numerous applications in automated systems such as human-behavior assessment, visual surveillance, healthcare, and entertainment. The objective of a vision-based HAR system is to understand human behavior in video data and determine the action being performed. This paper presents two end-to-end deep networks for human activity recognition, one based on the Convolutional Long Short-Term Memory (ConvLSTM) and the other on the Long-term Recurrent Convolutional Network (LRCN). The ConvLSTM (Shi et al., 2015) network exploits convolutions that extract spatial features while accounting for their temporal correlations (i.e., spatiotemporal prediction). The LRCN (Donahue et al., 2015) fuses the advantages of simple convolution layers and LSTM layers into a single model that adequately encodes spatiotemporal data. Usually, the CNN and LSTM models are used independently: the CNN first extracts spatial information from the frames, and the features gathered by the CNN are later used by the LSTM to predict the video's action. Rather than building two separate networks, which makes the whole process computationally expensive, we propose a single LRCN-based network that binds the CNN and LSTM layers together into one model. Additionally, a TimeDistributed layer is introduced into the network, which plays a vital role in encoding action videos and achieving the highest recognition accuracy. A side contribution of the paper is the evaluation of different convolutional neural network variants, including 2D-CNN and 3D-CNN, for human action recognition. An extensive experimental evaluation of the proposed deep networks is carried out on three large benchmark action datasets: UCF50, HMDB51, and UCF-101. The results reveal the effectiveness of the proposed algorithms; in particular, our LRCN-based algorithm outperforms the current state of the art, achieving recognition accuracies of 97.42% on UCF50, 73.63% on HMDB51, and 95.70% on UCF-101. (A minimal Keras-style LRCN sketch follows this entry.)
Citations: 0
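The LRCN pattern described here wraps per-frame CNN layers in TimeDistributed layers and feeds the resulting feature sequence to an LSTM. Below is a minimal Keras-style sketch of that pattern; the layer sizes, sequence length, and number of classes are placeholder assumptions, not the architecture evaluated in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lrcn(seq_len=20, h=64, w=64, n_classes=50):
    """Minimal LRCN sketch: TimeDistributed CNN layers extract per-frame spatial
    features, an LSTM models their temporal order, and a softmax head predicts the action."""
    model = models.Sequential([
        layers.Input(shape=(seq_len, h, w, 3)),
        layers.TimeDistributed(layers.Conv2D(16, 3, activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(4)),
        layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(4)),
        layers.TimeDistributed(layers.Flatten()),
        layers.LSTM(64),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model

build_lrcn().summary()
```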
STAD-ConvBi-LSTM: Spatio-temporal attention-based deep convolutional Bi-LSTM framework for abnormal activity recognition
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-04-28 · DOI: 10.1016/j.jvcir.2025.104465 · Volume 110, Article 104465
Roshni Singh, Abhilasha Sharma
Abstract: Human activity recognition has become a significant research area in computer vision. Real-time systems analyze actions to continuously monitor and recognize abnormal activities, thereby strengthening public security and surveillance in real-world settings. However, implementing such frameworks is challenging due to diverse actions, complex patterns, changing viewpoints, and background clutter, and recognizing abnormality in videos still requires dedicated attention to prediction accuracy and computational efficiency. To address these challenges, this work introduces an efficient spatio-temporal attention-based deep convolutional bidirectional long short-term memory framework. It also proposes a dual-attention convolutional neural network that combines a CNN, a bidirectional LSTM, and a spatio-temporal attention mechanism to extract salient human-centric features from video clips. Extensive experimental analysis shows that STAD-ConvBi-LSTM outperforms state-of-the-art methods on five challenging datasets (UCF50, UCF101, YouTube-Action, HMDB51, and Kinetics-600) and on our Synthesized Action dataset, achieving accuracies of 98.8%, 98.1%, 81.2%, 97.4%, 88.2%, and 96.7%, respectively. (A temporal-attention Bi-LSTM sketch follows this entry.)
Citations: 0
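One common way to combine a bidirectional LSTM with temporal attention is to score each time step and pool the hidden states with the resulting weights. The PyTorch sketch below shows that pattern over pre-extracted per-frame features; the dimensions and single-score attention are illustrative assumptions rather than the STAD-ConvBi-LSTM architecture itself.

```python
import torch
import torch.nn as nn

class TemporalAttentionBiLSTM(nn.Module):
    """Illustrative temporal attention over Bi-LSTM outputs: per-frame features are
    scored, and the attention-weighted sum forms the clip descriptor for classification."""
    def __init__(self, feat_dim=512, hidden=128, n_classes=13):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, frame_feats):                 # (B, T, feat_dim) per-frame CNN features
        h, _ = self.bilstm(frame_feats)             # (B, T, 2*hidden)
        w = torch.softmax(self.score(h), dim=1)     # temporal attention weights (B, T, 1)
        clip = (w * h).sum(dim=1)                   # attention-pooled clip descriptor
        return self.head(clip)

logits = TemporalAttentionBiLSTM()(torch.randn(2, 16, 512))
print(logits.shape)   # torch.Size([2, 13])
```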
A multiple-level additive distortion method for security improvement in palette image steganography
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-04-25 · DOI: 10.1016/j.jvcir.2025.104463 · Volume 110, Article 104463
Yi Chen, Hongxia Wang, Yunhe Cui, Guowei Shen, Chun Guo, Yong Liu, Hanzhou Wu
Abstract: With the rapid development of the Internet and communication technology, palette images have become a preferred medium for steganography. However, the security of palette image steganography remains a significant challenge. To address this, we propose a multiple-level additive distortion method for improving the security of palette image steganography. The proposed method comprises an index-level cost method and a pixel-level cost method. The index-level and pixel-level costs produced by these two methods reflect, respectively, changes in the relationships of adjacent indices and of the pixels corresponding to those indices, and they also capture the modification impact of steganography. The proposed method therefore improves the security of palette image steganography. We conducted extensive experiments on three datasets to verify the security improvement; the results show that the proposed multiple-level distortion method has a security advantage over four state-of-the-art methods. (A rough cost-assignment sketch follows this entry.)
Citations: 0
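The method assigns each position an additive cost built from an index-level term and a pixel-level term. The sketch below is a rough stand-in for that idea, scoring a hypothetical +1 index change by how much it disturbs relations to neighboring indices and by the palette color distance; the specific formulas, weights, and names are invented for illustration and differ from the paper's cost functions.

```python
import numpy as np

def additive_costs(indices, palette, w_index=1.0, w_pixel=1.0):
    """Rough per-position embedding cost for a +1 index change: an index-level term
    measuring how much the change disturbs relations to neighboring indices, plus a
    pixel-level term measuring the color distance between the two palette entries."""
    h, w = indices.shape
    cost = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            i = int(indices[y, x])
            j = min(i + 1, len(palette) - 1)                       # candidate modified index
            neigh = indices[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2].astype(float)
            idx_cost = np.abs(np.abs(j - neigh) - np.abs(i - neigh)).sum()
            pix_cost = np.linalg.norm(palette[i].astype(float) - palette[j].astype(float))
            cost[y, x] = w_index * idx_cost + w_pixel * pix_cost
    return cost

palette = np.array([[0, 0, 0], [10, 10, 10], [200, 50, 50]], dtype=np.uint8)
print(additive_costs(np.array([[0, 1], [1, 2]]), palette))
```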
DFTGL: Domain Filtered and Target Guided Learning for few-shot anomaly detection
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-04-25 · DOI: 10.1016/j.jvcir.2025.104457 · Volume 110, Article 104457
Jiajun Zhang, Yanzhi Song, Zhouwang Yang
Abstract: This paper addresses cross-domain challenges in few-shot anomaly detection, where using diverse source domains leads to diminished representations and compromised detection in the target domain. To tackle this, we propose Domain Filtering and Target-Guided Learning (DFTGL). First, we measure domain gaps and retain the source domains with smaller disparities. We then introduce a small number of target-domain samples to create an intermediate domain for better feature transfer during training. Additionally, we employ category-prior-based augmentation to refine feature distribution estimation while ensuring image registration. Experimental results demonstrate significant improvements in image-level AUROC over the baseline: 5.1%, 5.9%, and 4.1% (2-shot, 4-shot, and 8-shot settings) on MVTec and 6.9%, 2.1%, and 2.5% on the VisA dataset. This research effectively narrows domain gaps, enables proficient feature transfer, and holds promise for early anomaly detection in applications such as product inspection and medical diagnostics. (A distance-based domain filtering sketch follows this entry.)
Citations: 0
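Domain filtering amounts to scoring each source domain by its gap to the target and keeping the closest ones. The sketch below uses the distance between mean features as that score; the metric, the keep ratio, and the dictionary layout are illustrative assumptions, not the gap measure used in DFTGL.

```python
import torch

def filter_source_domains(source_feats, target_feats, keep_ratio=0.5):
    """Illustrative domain filtering: score each source domain by the distance between
    its mean feature and the target's mean feature, and keep the closest domains."""
    t_mean = target_feats.mean(dim=0)
    gaps = {name: torch.norm(feats.mean(dim=0) - t_mean).item()
            for name, feats in source_feats.items()}
    ranked = sorted(gaps, key=gaps.get)                    # smallest gap first
    keep = ranked[:max(1, int(len(ranked) * keep_ratio))]
    return keep, gaps

sources = {"dom_a": torch.randn(100, 128), "dom_b": torch.randn(100, 128) + 3.0}
kept, gaps = filter_source_domains(sources, torch.randn(20, 128))
print(kept, gaps)    # dom_a is kept: its features lie closer to the target's
```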
A GAN-based anti-forensics method by modifying the quantization table in JPEG header file
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation · Pub Date: 2025-04-23 · DOI: 10.1016/j.jvcir.2025.104462 · Volume 110, Article 104462
Hao Wang, Xin Cheng, Hao Wu, Xiangyang Luo, Bin Ma, Hui Zong, Jiawei Zhang, Jinwei Wang
Abstract: Detecting double JPEG compression is crucial in digital image forensics. When detecting recompressed images, most detection methods assume that the quantization table in the JPEG header is trustworthy, and they fail once that table is tampered with. Inspired by this observation, this paper proposes an anti-detection method for double JPEG compression, based on a generative adversarial network (GAN), that modifies the quantization table in the JPEG header file. The proposed method draws on the GAN structure to modify the quantization table by gradient descent and introduces an adversarial loss to determine the direction of the modification, so that the modified quantization table deceives detection methods. Anti-detection is achieved simply by replacing the original quantization table after network training. Experiments show that the proposed method attains a high anti-detection rate and generates images with high visual quality. (An illustrative gradient-descent sketch follows this entry.)
Citations: 0
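The attack optimizes the quantization table itself by gradient descent against a compression detector. The PyTorch sketch below captures that loop under strong simplifications: the detector is an arbitrary differentiable stand-in, "dequantization" is a plain element-wise product, and the clamping range and optimizer settings are assumptions rather than the paper's training setup.

```python
import torch

def update_quant_table(q_table, detector, dct_coeffs, steps=100, lr=0.5):
    """Treat the 8x8 quantization table as a learnable tensor and descend the detector's
    'double-compressed' score, pushing images dequantized with the modified table
    toward the 'single-compressed' decision."""
    q = q_table.clone().float().requires_grad_(True)
    opt = torch.optim.Adam([q], lr=lr)
    for _ in range(steps):
        recon = dct_coeffs * q                  # simplified dequantization with current table
        loss = detector(recon).mean()           # adversarial objective: lower the detection score
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            q.clamp_(1, 255)                    # keep entries valid for a JPEG header
    return q.detach().round().clamp(1, 255).to(torch.int64)

# Dummy differentiable detector standing in for a trained double-compression classifier.
detector = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64, 1), torch.nn.Sigmoid())
q_new = update_quant_table(torch.randint(1, 100, (8, 8)), detector, torch.randn(16, 8, 8))
print(q_new)
```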