2021 International Conference on Visual Communications and Image Processing (VCIP): Latest Publications

MPEG Immersive Video tools for Light Field Head Mounted Displays
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675317
Daniele Bonatto, Grégoire Hirt, Alexander Kvasov, Sarah Fachada, G. Lafruit
{"title":"MPEG Immersive Video tools for Light Field Head Mounted Displays","authors":"Daniele Bonatto, Grégoire Hirt, Alexander Kvasov, Sarah Fachada, G. Lafruit","doi":"10.1109/VCIP53242.2021.9675317","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675317","url":null,"abstract":"Light field displays project hundreds of micro-parallax views for users to perceive 3D without wearing glasses. It results in gigantic bandwidth requirements if all views would be transmitted, even using conventional video compression per view. MPEG Immersive Video (MIV) follows a smarter strategy by transmitting only key images and some metadata to synthesize all the missing views. We developed (and will demonstrate) a real-time Depth Image Based Rendering software that follows this approach for synthesizing all light field micro-parallax views from a couple of RGBD input views.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122519132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Learning in Compressed Domain for Faster Machine Vision Tasks
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675369
Jinming Liu, Heming Sun, J. Katto
{"title":"Learning in Compressed Domain for Faster Machine Vision Tasks","authors":"Jinming Liu, Heming Sun, J. Katto","doi":"10.1109/VCIP53242.2021.9675369","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675369","url":null,"abstract":"Learned image compression (LIC) has illustrated good ability for reconstruction quality driven tasks (e.g. PSNR, MS-SSIM) and machine vision tasks such as image understanding. However, most LIC frameworks are based on pixel domain, which requires the decoding process. In this paper, we develop a learned compressed domain framework for machine vision tasks. 1) By sending the compressed latent representation directly to the task network, the decoding computation can be eliminated to reduce the complexity. 2) By sorting the latent channels by entropy, only selective channels will be transmitted to the task network, which can reduce the bitrate. As a result, compared with the traditional pixel domain methods, we can reduce about 1/3 multiply-add operations (MACs) and 1/5 inference time while keeping the same accuracy. Moreover, proposed channel selection can contribute to at most 6.8% bitrate saving.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115869058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
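The channel-selection idea in the abstract above (rank the latent channels by entropy and transmit only a subset) can be illustrated with a minimal sketch. It is not the authors' implementation: the empirical histogram entropy, the choice to keep the highest-entropy channels, and the names `channel_entropy` and `select_channels` are assumptions made for illustration. The abstract does not say how entropy is estimated or which end of the ranking is kept.

```python
import math
from collections import Counter


def channel_entropy(symbols):
    """Empirical Shannon entropy (bits per symbol) of one quantized channel."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_channels(latent, keep):
    """Return the indices of the `keep` highest-entropy channels, highest first.

    `latent` is a list of channels, each a flat list of quantized symbols.
    Keeping the highest-entropy channels is an assumption of this sketch.
    """
    ranked = sorted(
        range(len(latent)),
        key=lambda i: channel_entropy(latent[i]),
        reverse=True,
    )
    return ranked[:keep]


# Toy latent with three channels of four quantized symbols each (hypothetical data).
latent = [
    [0, 0, 0, 0],  # constant channel
    [0, 1, 2, 3],  # four equally likely symbols
    [0, 0, 1, 1],  # two equally likely symbols
]
selected = select_channels(latent, keep=2)
print(selected)
```

In a real learned codec the per-channel rate would more plausibly come from the codec's own entropy model than from a histogram; the ranking-and-truncation step would stay the same.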
Evaluation Of Bitrate Ladders For Versatile Video Coder
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675425
Reda Kaafarani, Médéric Blestel, Thomas Maugey, M. Ropert, A. Roumy
{"title":"Evaluation Of Bitrate Ladders For Versatile Video Coder","authors":"Reda Kaafarani, Médéric Blestel, Thomas Maugey, M. Ropert, A. Roumy","doi":"10.1109/VCIP53242.2021.9675425","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675425","url":null,"abstract":"Many video service providers take advantage of bitrate ladders in adaptive HTTP video streaming to account for different network states and user display specifications by providing bitrate/resolution pairs that best fit client's network conditions and display capabilities. These bitrate ladders, however, differ when using different codecs and thus the couples bitrate/resolution differ as well. In addition, bitrate ladders are based on previously available codecs (H.264/MPEG4-AVC, HEVC, etc.), i.e. codecs that are already in service, hence the introduction of new codecs e.g. Versatile Video Coding (VVC) requires re-analyzing these ladders. For that matter, we will analyze the evolution of the bitrate ladder when using VVC. We show how VVC impacts this ladder when compared to HEVC and H.264/AVC and in particular, that there is no need to switch to lower resolutions at the lower bitrates defined in the Call for Evidence on Transcoding for Network Distributed Video Coding (CfE).","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128403688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Multi-camera system for placing the viewer between the players of a live sports match: Blind Review
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675336
{"title":"Multi-camera system for placing the viewer between the players of a live sports match: Blind Review","authors":"","doi":"10.1109/VCIP53242.2021.9675336","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675336","url":null,"abstract":"We demonstrate a new capture system that allows generation of virtual views corresponding with a virtual camera that is placed between the players on a sports field. Our depth estimation and segmentation pipeline can reduce 2K resolution views from 16 cameras to patches in a single 4K resolution texture atlas. We have created a real time, WebGL 2 based, playback application that renders an arbitrary view from the 4K atlas. The application allows a user to change viewpoint in real time. Additionally, to interpret the scene, a user can also remove objects such as a player or the ball. At the conference we will demonstrate both the automatic multi-camera conversion pipeline and the real-time rendering/object removal on a smartphone.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127288781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Kalman filter-based prediction refinement and quality enhancement for geometry-based point cloud compression
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675412
Lu Wang, Jianfeng Sun, Hui Yuan, R. Hamzaoui, Xiaohui Wang
{"title":"Kalman filter-based prediction refinement and quality enhancement for geometry-based point cloud compression","authors":"Lu Wang, Jianfeng Sun, Hui Yuan, R. Hamzaoui, Xiaohui Wang","doi":"10.1109/VCIP53242.2021.9675412","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675412","url":null,"abstract":"A point cloud is a set of points representing a three-dimensional (3D) object or scene. To compress a point cloud, the Motion Picture Experts Group (MPEG) geometry-based point cloud compression (G-PCC) scheme may use three attribute coding methods: region adaptive hierarchical transform (RAHT), predicting transform (PT), and lifting transform (LT). To improve the coding efficiency of PT, we propose to use a Kalman filter to refine the predicted attribute values. We also apply a Kalman filter to improve the quality of the reconstructed attribute values at the decoder side. Experimental results show that the combination of the two proposed methods can achieve an average Bjøntegaard delta bitrate of −0.48%, −5.18%, and −6.27% for the Luma, Chroma Cb, and Chroma Cr components, respectively, compared with a recent G-PCC reference software.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131834071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
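For readers unfamiliar with the refinement step named in the abstract above, the generic scalar Kalman update is sketched below. This is the textbook form only: the abstract does not specify the state model, the noise variances, or what serves as the measurement in G-PCC attribute prediction, so the function name `kalman_update` and all numeric values here are hypothetical.

```python
def kalman_update(estimate, variance, measurement, measurement_variance):
    """One scalar Kalman update: blend a prior estimate with a noisy measurement.

    Returns the refined estimate and its reduced variance.
    """
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


# Hypothetical values: a predicted attribute of 100 refined by an observation of 110,
# with equal uncertainty on both, so the result lands halfway between them.
refined, refined_variance = kalman_update(100.0, 4.0, 110.0, 4.0)
print(refined, refined_variance)
```

The gain weights the measurement by how uncertain the prior is relative to the measurement, so equal variances produce a simple average.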
Attention-guided Convolutional Neural Network for Lightweight JPEG Compression Artifacts Removal
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675320
Gang Zhang, Haoquan Wang, Yedong Wang, Haijie Shen
{"title":"Attention-guided Convolutional Neural Network for Lightweight JPEG Compression Artifacts Removal","authors":"Gang Zhang, Haoquan Wang, Yedong Wang, Haijie Shen","doi":"10.1109/VCIP53242.2021.9675320","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675320","url":null,"abstract":"JPEG compression artifacts seriously affect the viewing experience. While previous studies mainly focused on the deep convolutional networks for compression artifacts removal, of which the model size and inference speed limit their application prospects. In order to solve the above problems, this paper proposed two methods that can improve the training performance of the compact convolution network without slowing down its inference speed. Firstly, a fully explainable attention loss is designed to guide the network for training, which is calculated by local entropy to accurately locate compression artifacts. Secondly, Fully Expanded Block (FEB) is proposed to replace the convolutional layer in compact network, which can be contracted back to a normal convolutional layer after the training process is completed. Extensive experiments demonstrate that the proposed method outperforms the existing lightweight methods in terms of performance and inference speed.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133344768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CRC-Based Multi-Error Correction of H.265 Encoded Videos in Wireless Communications
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675400
Vivien Boussard, S. Coulombe, F. Coudoux, P. Corlay, Anthony Trioux
{"title":"CRC-Based Multi-Error Correction of H.265 Encoded Videos in Wireless Communications","authors":"Vivien Boussard, S. Coulombe, F. Coudoux, P. Corlay, Anthony Trioux","doi":"10.1109/VCIP53242.2021.9675400","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675400","url":null,"abstract":"This paper analyzes the benefits of extending CRC-based error correction (CRC-EC) to handle more errors in the context of error-prone wireless networks. In the literature, CRC-EC has been used to correct up to 3 binary errors per packet. We first present a theoretical analysis of the CRC-EC candidate list while increasing the number of errors considered. We then analyze the candidate list reduction resulting from subsequent checksum validation and video decoding steps. Simulations conducted on two wireless networks show that the network considered has a huge impact on CRC-EC performance. Over a Bluetooth low energy (BLE) channel with Eb/No=8 dB, an average PSNR improvement of 4.4 dB on videos is achieved when CRC-EC corrects up to 5, rather than 3 errors per packet.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132757244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
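The candidate list mentioned in the abstract above can be pictured with a brute-force sketch: try every pattern of up to `max_errors` bit flips and keep the payloads whose CRC matches the received one. This is an illustrative assumption, not the paper's algorithm. It uses CRC-32 from Python's `zlib`, assumes the received CRC field itself is error-free, and the name `crc_ec_candidates` is hypothetical. Exhaustive search like this is only practical for tiny packets.

```python
import itertools
import zlib


def crc_ec_candidates(payload, received_crc, max_errors):
    """Return every payload within `max_errors` bit flips whose CRC-32 matches."""
    n_bits = len(payload) * 8
    candidates = []
    for n_flips in range(max_errors + 1):
        for positions in itertools.combinations(range(n_bits), n_flips):
            trial = bytearray(payload)
            for pos in positions:
                trial[pos // 8] ^= 1 << (pos % 8)  # flip one bit
            if zlib.crc32(bytes(trial)) == received_crc:
                candidates.append(bytes(trial))
    return candidates


# Toy packet (hypothetical): corrupt two bits, then search for matching payloads.
original = b"VCIP"
crc = zlib.crc32(original)
corrupted = bytearray(original)
corrupted[0] ^= 0b00000001
corrupted[2] ^= 0b00010000
candidates = crc_ec_candidates(bytes(corrupted), crc, max_errors=2)
print(candidates)
```

On longer packets several candidates can survive the CRC check, which is where the checksum validation and video decoding steps described in the abstract would narrow the list.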
Cross-Block Difference Guided Fast CU Partition for VVC Intra Coding
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675409
Hewei Liu, Shuyuan Zhu, Ruiqin Xiong, Guanghui Liu, B. Zeng
{"title":"Cross-Block Difference Guided Fast CU Partition for VVC Intra Coding","authors":"Hewei Liu, Shuyuan Zhu, Ruiqin Xiong, Guanghui Liu, B. Zeng","doi":"10.1109/VCIP53242.2021.9675409","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675409","url":null,"abstract":"In this paper, we propose a new fast CU partition method for VVC intra coding based on the cross-block difference. This difference is measured by the gradient and the content of sub-blocks obtained from partition and is employed to guide the skipping of unnecessary horizontal and vertical partition modes. With this guidance, a fast determination of block partitions is accordingly achieved. Compared with VVC, our proposed method can save 41.64% (on average) encoding time with only 0.97% (on average) increase of BD-rate.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129510306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Action Recognition Improved by Correlations and Attention of Subjects and Scene
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675340
Manh-Hung Ha, O. Chen
{"title":"Action Recognition Improved by Correlations and Attention of Subjects and Scene","authors":"Manh-Hung Ha, O. Chen","doi":"10.1109/VCIP53242.2021.9675340","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675340","url":null,"abstract":"Comprehensive activity understanding of multiple subjects in a video requires subject detection, action identification, and behavior interpretation as well as the interactions among subjects and background. This work develops the action recognition of subject(s) based on the correlations and interactions of the whole scene and subject(s) by using the Deep Neural Network (DNN). The proposed DNN consists of 3D Convolutional Neural Network (CNN), Spatial Attention (SA) generation layer, mapping convolutional fused-depth layer, Transformer Encoder (TE), and two fully connected layers with late fusion for final classification. Especially, the attention mechanisms in SA and TE are implemented to find out meaningful action information on spatial and temporal domains for enhancing recognition performance, respectively. The experimental results reveal that the proposed DNN shows the superior accuracies of 97.8%, 98.4% and 85.6% in the datasets of traffic police, UCF101-24 and JHMDB-21, respectively. Therefore, our DNN is an outstanding classifier for various action recognitions involving one or multiple subjects.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131363603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Nearly Reversible Image-to-Image Translation Using Joint Inter-Frame Coding and Embedding
2021 International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2021-12-05 DOI: 10.1109/VCIP53242.2021.9675370
Xinzhu Cao, Yuanzhi Yao, Nenghai Yu
{"title":"Nearly Reversible Image-to-Image Translation Using Joint Inter-Frame Coding and Embedding","authors":"Xinzhu Cao, Yuanzhi Yao, Nenghai Yu","doi":"10.1109/VCIP53242.2021.9675370","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675370","url":null,"abstract":"Image-to-image translation tasks which have been widely investigated with generative adversarial networks (GAN) aim to map an image from the source domain to the target domain. The translated image can be inversely mapped to the reconstructed source image. However, existing GAN-based schemes lack the ability to accomplish reversible translation. To remedy this drawback, a nearly reversible image-to-image translation scheme where the reconstructed source image is approximately distortion-free compared with the corresponding source image is proposed in this paper. The proposed scheme jointly considers inter-frame coding and embedding. Firstly, we organize the GAN-generated reconstructed source image and the source image into a pseudo video. Furthermore, the bitstream obtained by inter-frame coding is reversibly embedded in the translated image for nearly lossless source image reconstruction. Extensive experimental results and analysis demonstrate that the proposed scheme can achieve a high level of performance in image quality and security.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"343 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124234169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0