2021 International Conference on Visual Communications and Image Processing (VCIP) — Latest Publications

Perceptual Evaluation of Pre-processing for Video Transcoding
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675438
Shiyu Huang, Ziyuan Luo, Jiahua Xu, Wei Zhou, Zhibo Chen
{"title":"Perceptual Evaluation of Pre-processing for Video Transcoding","authors":"Shiyu Huang, Ziyuan Luo, Jiahua Xu, Wei Zhou, Zhibo Chen","doi":"10.1109/VCIP53242.2021.9675438","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675438","url":null,"abstract":"Recently, the pre-processed video transcoding has attracted wide attention and has been increasingly used in practical applications for improving the perceptual experience and saving transmission resources. However, very few works have been conducted to evaluate the performance of pre-processing methods. In this paper, we select the source (SRC) videos and various pre-processing approaches to construct the first Pre-processed and Transcoded Video Database (PTVD). Then, we conduct the subjective experiment, showing that compared with the video sent to the codec directly at the same bitrate, the appropriate pre-processing methods indeed improve the perceptual quality. Finally, existing image/video quality metrics are evaluated on our database. The results indicate that the performance of the existing image/video quality assessment (IQA/VQA) approaches remain to be improved. We will make our database publicly available soon.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122839661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
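The benchmarking step described above — checking how well objective metrics track subjective judgments — is conventionally reported as correlation against mean opinion scores (MOS). A minimal sketch, with hypothetical scores standing in for PTVD data:

```python
# Sketch: correlating an objective quality metric with subjective MOS,
# as when benchmarking IQA/VQA metrics on a database such as PTVD.
# The score arrays below are hypothetical placeholders.
import numpy as np
from scipy.stats import pearsonr, spearmanr

mos = np.array([3.2, 4.1, 2.5, 4.6, 3.8])          # subjective scores
metric = np.array([0.61, 0.83, 0.44, 0.90, 0.72])  # objective metric output

plcc, _ = pearsonr(metric, mos)    # linear correlation (prediction accuracy)
srocc, _ = spearmanr(metric, mos)  # rank correlation (monotonicity)
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```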
CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675351
Charles Bonnineau, W. Hamidouche, J. Travers, N. Sidaty, Jean-Yves Aubié, O. Déforges
{"title":"CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability","authors":"Charles Bonnineau, W. Hamidouche, J. Travers, N. Sidaty, Jean-Yves Aubié, O. Déforges","doi":"10.1109/VCIP53242.2021.9675351","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675351","url":null,"abstract":"In this paper, we present CAESR, an hybrid learning-based coding approach for spatial scalability based on the versatile video coding (VVC) standard. Our framework considers a low-resolution signal encoded with VVC intra-mode as a base-layer (BL), and a deep conditional autoencoder with hyperprior (AE-HP) as an enhancement-layer (EL) model. The EL encoder takes as inputs both the upscaled BL reconstruction and the original image. Our approach relies on conditional coding that learns the optimal mixture of the source and the upscaled BL image, enabling better performance than residual coding. On the decoder side, a super-resolution (SR) module is used to recover high-resolution details and invert the conditional coding process. Experimental results have shown that our solution is competitive with the VVC full-resolution intra coding while being scalable.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115225404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
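To illustrate the conditional-coding idea (as opposed to residual coding), here is a minimal PyTorch sketch of an enhancement-layer encoder that consumes the source image concatenated with the upscaled BL reconstruction; the layer sizes and names are illustrative assumptions, not the authors' architecture:

```python
# Sketch of a conditional enhancement-layer encoder in the spirit of
# CAESR: instead of coding the residual x - upscale(BL), the network
# learns its own mixture of source and upscaled BL.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalELEncoder(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # input: original image (3 ch) concatenated with upscaled BL (3 ch)
        self.net = nn.Sequential(
            nn.Conv2d(6, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2),
        )

    def forward(self, x, bl_recon):
        # upscale the base-layer reconstruction to full resolution
        bl_up = F.interpolate(bl_recon, size=x.shape[-2:], mode='bicubic',
                              align_corners=False)
        return self.net(torch.cat([x, bl_up], dim=1))

x = torch.rand(1, 3, 256, 256)   # original image
bl = torch.rand(1, 3, 128, 128)  # low-resolution BL reconstruction
latent = ConditionalELEncoder()(x, bl)
print(latent.shape)              # torch.Size([1, 64, 32, 32])
```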
Security and Forensics Exploration of Learning-based Image Coding
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675445
Deepayan Bhowmik, Mohamed Elawady, Keiller Nogueira
{"title":"Security and Forensics Exploration of Learning-based Image Coding","authors":"Deepayan Bhowmik, Mohamed Elawady, Keiller Nogueira","doi":"10.1109/VCIP53242.2021.9675445","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675445","url":null,"abstract":"Advances in media compression indicate significant potential to drive future media coding standards, e.g., Joint Photographic Experts Group's learning-based image coding technologies (JPEG AI) and Joint Video Experts Team's (JVET) deep neural networks (DNN) based video coding. These codecs in fact represent a new type of media format. As a dire consequence, traditional media security and forensic techniques will no longer be of use. This paper proposes an initial study on the effectiveness of traditional watermarking on two state-of-the-art learning based image coding. Results indicate that traditional watermarking methods are no longer effective. We also examine the forensic trails of various DNN architectures in the learning based codecs by proposing a residual noise based source identification algorithm that achieved 79% accuracy.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127250561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
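A rough sketch of the residual-noise idea: suppress image content with a denoiser and keep the noise-like traces, which a classifier can then associate with a specific codec. The denoiser and classifier choices here are assumptions for illustration, not the paper's exact pipeline:

```python
# Sketch of residual-noise extraction for source identification: the
# residual (decoded image minus a denoised version of it) carries
# codec-specific traces.
import numpy as np
from scipy.ndimage import median_filter

def residual_noise(decoded: np.ndarray) -> np.ndarray:
    """Suppress content with a denoiser; keep the noise-like traces."""
    denoised = median_filter(decoded, size=3)
    return decoded.astype(np.float64) - denoised.astype(np.float64)

# A classifier (e.g., a small CNN or an SVM) would then be trained on
# residual patches to predict which DNN codec produced each image:
#   features = [residual_noise(img) for img in decoded_images]
#   clf.fit(features, codec_labels)
```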
Learning-Based Complexity Reduction Scheme for VVC Intra-Frame Prediction
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675394
Mário Saldanha, G. Sanchez, C. Marcon, L. Agostini
{"title":"Learning-Based Complexity Reduction Scheme for VVC Intra-Frame Prediction","authors":"Mário Saldanha, G. Sanchez, C. Marcon, L. Agostini","doi":"10.1109/VCIP53242.2021.9675394","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675394","url":null,"abstract":"This paper presents a learning-based complexity reduction scheme for Versatile Video Coding (VVC) intra-frame prediction. VVC introduces several novel coding tools to improve the coding efficiency of the intra-frame prediction at the cost of a high computational effort. Thus, we developed an efficient complexity reduction scheme composed of three solutions based on machine learning and statistical analysis to reduce the number of intra prediction modes evaluated in the costly Rate-Distortion Optimization (RDO) process. Experimental results demonstrated that the proposed solution provides 18.32% encoding timesaving with a negligible impact on the coding efficiency.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127353997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
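The mode-pruning idea can be sketched as follows: a lightweight model scores the candidate intra modes from block features, and only the top-k survive into full RDO. The feature vector and ranking model below are hypothetical placeholders, not the paper's three specific solutions:

```python
# Sketch of learning-based intra-mode pruning ahead of RDO. The model
# is assumed to be any sklearn-style classifier whose classes align
# with the candidate mode list.
import numpy as np

def prune_modes(block_features, candidate_modes, model, k=3):
    """Keep the k most promising intra modes for the costly RDO stage."""
    scores = model.predict_proba([block_features])[0]  # one score per mode
    ranked = np.argsort(scores)[::-1]                  # best first
    return [candidate_modes[i] for i in ranked[:k]]

# Full RDO then runs only on the pruned list:
#   best = min(pruned, key=lambda m: rd_cost(block, m))
```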
Faster and Finer Pose Estimation for Object Pool in a Single RGB Image
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675316
Lee Aing, W. Lie, J. Chiang
{"title":"Faster and Finer Pose Estimation for Object Pool in a Single RGB Image","authors":"Lee Aing, W. Lie, J. Chiang","doi":"10.1109/VCIP53242.2021.9675316","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675316","url":null,"abstract":"Predicting/estimating the 6DoF pose parameters for multi-instance objects accurately in a fast manner is an important issue in robotic and computer vision. Even though some bottom-up methods have been proposed to be able to estimate multiple instance poses simultaneously, their accuracy cannot be considered as good enough when compared to other state-of-the-art top-down methods. Their processing speed still cannot respond to practical applications. In this paper, we present a faster and finer bottom-up approach of deep convolutional neural network to estimate poses of the object pool even multiple instances of the same object category present high occlusion/overlapping. Several techniques such as prediction of semantic segmentation map, multiple keypoint vector field, and 3D coordinate map, and diagonal graph clustering are proposed and combined to achieve the purpose. Experimental results and ablation studies show that the proposed system can achieve comparable accuracy at a speed of 24.7 frames per second for up to 7 objects by evaluation on the well-known Occlusion LINEMOD dataset.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126903455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
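One common bottom-up ingredient the abstract mentions is a keypoint vector field, where pixels vote for a keypoint location via the intersections of their predicted rays (in the style of PVNet-like voting). The following is a generic sketch of one such vote, not the paper's diagonal graph clustering scheme:

```python
# Sketch: two pixels each predict a unit direction toward a keypoint;
# their rays' least-squares intersection is one vote for its location.
import numpy as np

def ray_intersection(p1, d1, p2, d2):
    """Least-squares intersection of 2D rays p + t*d."""
    A = np.stack([d1, -d2], axis=1)
    t, *_ = np.linalg.lstsq(A, p2 - p1, rcond=None)
    return p1 + t[0] * d1

p1, d1 = np.array([0., 0.]), np.array([1., 1.]) / np.sqrt(2)
p2, d2 = np.array([4., 0.]), np.array([0., 1.])
print(ray_intersection(p1, d1, p2, d2))  # ~[4., 4.]
```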
HCiT: Deepfake Video Detection Using a Hybrid Model of CNN features and Vision Transformer
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675402
Bachir Kaddar, Sid Ahmed Fezza, W. Hamidouche, Z. Akhtar, A. Hadid
{"title":"HCiT: Deepfake Video Detection Using a Hybrid Model of CNN features and Vision Transformer","authors":"Bachir Kaddar, Sid Ahmed Fezza, W. Hamidouche, Z. Akhtar, A. Hadid","doi":"10.1109/VCIP53242.2021.9675402","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675402","url":null,"abstract":"The number of new falsified video contents is dramatically increasing, making the need to develop effective deepfake detection methods more urgent than ever. Even though many existing deepfake detection approaches show promising results, the majority of them still suffer from a number of critical limitations. In general, poor generalization results have been obtained under unseen or new deepfake generation methods. Consequently, in this paper, we propose a deepfake detection method called HCiT, which combines Convolutional Neural Network (CNN) with Vision Transformer (ViT). The HCiT hybrid architecture exploits the advantages of CNN to extract local information with the ViT's self-attention mechanism to improve the detection accuracy. In this hybrid architecture, the feature maps extracted from the CNN are feed into ViT model that determines whether a specific video is fake or real. Experiments were performed on Faceforensics++ and DeepFake Detection Challenge preview datasets, and the results show that the proposed method significantly outperforms the state-of-the-art methods. In addition, the HCiT method shows a great capacity for generalization on datasets covering various techniques of deepfake generation. The source code is available at: https://github.com/KADDAR-Bachir/HCiT","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125272892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
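A minimal sketch of the CNN-to-ViT handoff: CNN feature maps are flattened into a token sequence and passed through a Transformer encoder before a binary real/fake head. The backbone, dimensions, and pooling below are illustrative assumptions, not the authors' exact configuration:

```python
# Sketch of a hybrid CNN + Transformer detector: local features from a
# small CNN become tokens for self-attention, then a real/fake head.
import torch
import torch.nn as nn

class HybridDetector(nn.Module):
    def __init__(self, dim=256, heads=8, depth=4):
        super().__init__()
        self.cnn = nn.Sequential(  # local feature extractor
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.vit = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, 2)  # real vs. fake

    def forward(self, x):
        f = self.cnn(x)                        # (B, dim, H/4, W/4)
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W/16, dim)
        out = self.vit(tokens).mean(dim=1)     # average over tokens
        return self.head(out)

logits = HybridDetector()(torch.rand(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 2])
```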
LRS-Net: invisible QR Code embedding, detection, and restoration
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675327
Yiyan Yang, Zhongpai Gao, Guangtao Zhai
{"title":"LRS-Net: invisible QR Code embedding, detection, and restoration","authors":"Yiyan Yang, Zhongpai Gao, Guangtao Zhai","doi":"10.1109/VCIP53242.2021.9675327","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675327","url":null,"abstract":"QR code is a powerful tool to bridge the offline and online worlds. It has been widely used because it can store a large amount of information in a small space. However, the black-and-white style of QR codes is not attractive to the human eyes when embedded in videos, which greatly affects the viewing experience. Invisible QR code has proposed based on temporal psycho-visual modulation (TPVM) to embed invisible hyperlinks in shopping websites, copyright watermarks in movies, etc. However, existing embedding and detection methods are not robust enough. In this paper, we adopt a novel embedding method to greatly improve the visual quality of the embedded video. Furthermore, we build a new dataset of invisible QR codes named 'IQRCodes' to train deep neural networks. At last, we propose localization, refinement, and segmentation neural netowrks (LRS-Net) to efficiently detect and restore invisible QR codes that are captured by mobile phones.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114558838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
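TPVM embedding generally works by splitting each frame into two complementary sub-frames whose temporal average equals the original: human vision integrates over time and sees the plain video, while a camera sampling a single sub-frame can recover the pattern. A minimal sketch with an assumed modulation strength:

```python
# Sketch of TPVM-style complementary-frame embedding of a QR mask.
# delta (modulation strength) is an illustrative assumption.
import numpy as np

def embed_tpvm(frame: np.ndarray, qr_mask: np.ndarray, delta=8.0):
    """frame: HxWx3 float image; qr_mask: HxW array in {0, 1}."""
    mod = delta * (2 * qr_mask - 1)[..., None]  # +delta / -delta per pixel
    f1 = np.clip(frame + mod, 0, 255)
    f2 = np.clip(frame - mod, 0, 255)
    return f1, f2  # displayed back-to-back at a high refresh rate

frame = np.full((4, 4, 3), 128.0)
mask = np.eye(4)
f1, f2 = embed_tpvm(frame, mask)
assert np.allclose((f1 + f2) / 2, frame)  # the eye sees the original
```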
360HRL: Hierarchical Reinforcement Learning Based Rate Adaptation for 360-Degree Video Streaming
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675439
Jun Fu, Chen Hou, Zhibo Chen
{"title":"360HRL: Hierarchical Reinforcement Learning Based Rate Adaptation for 360-Degree Video Streaming","authors":"Jun Fu, Chen Hou, Zhibo Chen","doi":"10.1109/VCIP53242.2021.9675439","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675439","url":null,"abstract":"Recently, reinforced adaptive bitrate (ABR) algorithms have achieved remarkable success in tile-based 360-degree video streaming. However, they heavily rely on accurate viewport prediction. To alleviate this issue, we propose a hierarchical reinforcement-learning (RL) based ABR algorithm, dubbed 360HRL. Specifically, 360HRL consists of a top agent and a bottom agent. The former is used to decide whether to download a new segment for continuous playback or re-download an old segment for correcting wrong bitrate decisions caused by inaccurate viewport estimation, and the latter is used to select bitrates for tiles in the chosen segment. In addition, 360HRL adopts a two-stage training methodology. In the first stage, the bottom agent is trained under the environment where the top agent always chooses to download a new segment. In the second stage, the bottom agent is fixed and the top agent is optimized with the help of a heuristic decision rule. Experimental results demonstrate that 360HRL outperforms existing RL-based ABR algorithms across a broad of network conditions and quality of experience (QoE) objectives.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124075777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
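The two-level control flow can be sketched as follows; agent internals and the state object are stubbed assumptions, so this only illustrates how the top agent's download/re-download decision gates the bottom agent's per-tile bitrate selection:

```python
# Sketch of the hierarchical decision loop in 360HRL-style streaming.
def streaming_step(top_agent, bottom_agent, state):
    # Top agent: continue playback, or fix an earlier, badly chosen segment.
    action = top_agent.act(state)  # 'download' or 'redownload'
    if action == 'download':
        segment = state.next_segment
    else:
        # earlier segment whose bitrates came from a bad viewport guess
        segment = state.worst_past_segment
    # Bottom agent: per-tile bitrates for the chosen segment.
    tile_bitrates = bottom_agent.act(state, segment)
    return segment, tile_bitrates
```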
Scalable Privacy in Multi-Task Image Compression
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675357
Saeed Ranjbar Alvar, I. Bajić
{"title":"Scalable Privacy in Multi-Task Image Compression","authors":"Saeed Ranjbar Alvar, I. Bajić","doi":"10.1109/VCIP53242.2021.9675357","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675357","url":null,"abstract":"Learning-based compression systems have shown great potential for multi-task inference from their latent-space representation of the input image. In such systems, the decoder is supposed to be able to perform various analyses of the input image, such as object detection or segmentation, besides decoding the image. At the same time, privacy concerns around visual ana-lytics have grown in response to the increasing capabilities of such systems to reveal private information. In this paper, we propose a method to make latent-space inference more privacy-friendly using mutual information-based criteria. In particular, we show how organizing and compressing the latent representation of the image according to task-specific mutual information can make the model maintain high analytics accuracy while becoming less able to reconstruct the input image and thereby reveal private information.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115486521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
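One way to make the channel organization concrete: rank latent channels by an estimate of their mutual information with the task labels, keeping the most task-relevant channels in the analytics-facing representation. The sketch below uses sklearn's generic MI estimator on random placeholder data, purely for illustration; it is not the paper's criterion:

```python
# Sketch: partition latent channels by estimated task-specific mutual
# information, so analytics-relevant channels can be shared while the
# rest (carrying more purely visual detail) can be withheld.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

latents = np.random.rand(500, 32)           # 500 samples, 32 channels
task_labels = np.random.randint(0, 10, 500)  # placeholder task labels

mi = mutual_info_classif(latents, task_labels)  # MI estimate per channel
order = np.argsort(mi)[::-1]
task_channels = order[:16]      # most task-relevant half
private_channels = order[16:]   # least task-relevant half
```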
Spatio-spectral Image Reconstruction Using Non-local Filtering
2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675421
Frank Sippel, Jürgen Seiler, A. Kaup
{"title":"Spatio-spectral Image Reconstruction Using Non-local Filtering","authors":"Frank Sippel, Jürgen Seiler, A. Kaup","doi":"10.1109/VCIP53242.2021.9675421","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675421","url":null,"abstract":"In many image processing tasks it occurs that pixels or blocks of pixels are missing or lost in only some channels. For example during defective transmissions of RGB images, it may happen that one or more blocks in one color channel are lost. Nearly all modern applications in image processing and transmission use at least three color channels, some of the applications employ even more bands, for example in the infrared and ultraviolet area of the light spectrum. Typically, only some pixels and blocks in a subset of color channels are distorted. Thus, other channels can be used to reconstruct the missing pixels, which is called spatio-spectral reconstruction. Current state-of-the-art methods purely rely on the local neighborhood, which works well for homogeneous regions. However, in high-frequency regions like edges or textures, these methods fail to properly model the relationship between color bands. Hence, this paper introduces non-local filtering for building a linear regression model that describes the inter-band relationship and is used to reconstruct the missing pixels. Our novel method is able to increase the PSNR on average by 2 dB and yields visually much more appealing images in high-frequency regions.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121806766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
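The inter-band regression at the heart of this approach can be sketched in a few lines: fit a linear model from an intact band to the damaged band over intact pixels (gathered non-locally from similar patches, a step elided here), then predict the lost pixels:

```python
# Sketch: reconstruct lost red-channel pixels from the green channel
# via linear regression over co-located intact pixels.
import numpy as np

def reconstruct_pixels(g_samples, r_samples, g_missing):
    """Fit r = a*g + b on intact pixels; predict r at the lost pixels."""
    A = np.stack([g_samples, np.ones_like(g_samples)], axis=1)
    coef, *_ = np.linalg.lstsq(A, r_samples, rcond=None)
    return coef[0] * g_missing + coef[1]

g = np.array([10., 20., 30., 40.])  # green values at intact pixels
r = np.array([12., 24., 33., 45.])  # red values at the same pixels
print(reconstruct_pixels(g, r, np.array([25.])))  # predicted red value
```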