2021 International Conference on Visual Communications and Image Processing (VCIP) — Latest Publications

Multi-Dimension Aware Back Projection Network For Scene Text Detection
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675323
Yizhan Zhao, Sumei Li, Yongli Chang
{"title":"Multi-Dimension Aware Back Projection Network For Scene Text Detection","authors":"Yizhan Zhao, Sumei Li, Yongli Chang","doi":"10.1109/VCIP53242.2021.9675323","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675323","url":null,"abstract":"Recently, scene text detection based on deep learning has progressed substantially. Nevertheless, most previous models with FPN are limited by the drawback of sample interpolation algorithms, which fail to generate high-quality up-sampled features. Accordingly, we propose an end-to-end trainable text detector to alleviate the above dilemma. Specifically, a Back Projection Enhanced Up-sampling (BPEU) block is proposed to alleviate the drawback of sample interpolation algorithms. It significantly enhances the quality of up-sampled features by employing back projection and detail compensation. Further-more, a Multi-Dimensional Attention (MDA) block is devised to learn different knowledge from spatial and channel dimensions, which intelligently selects features to generate more discriminative representations. Experimental results on three benchmarks, ICDAR2015, ICDAR2017- MLT and MSRA-TD500, demonstrate the effectiveness of our method.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130606417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DIRECT: Discrete Image Rescaling with Enhancement from Case-specific Textures
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675420
Yan-An Chen, Ching-Chun Hsiao, Wen-Hsiao Peng, Ching-Chun Huang
{"title":"DIRECT: Discrete Image Rescaling with Enhancement from Case-specific Textures","authors":"Yan-An Chen, Ching-Chun Hsiao, Wen-Hsiao Peng, Ching-Chun Huang","doi":"10.1109/VCIP53242.2021.9675420","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675420","url":null,"abstract":"This paper addresses image rescaling, the task of which is to downscale an input image followed by upscaling for the purposes of transmission, storage, or playback on heterogeneous devices. The state-of-the-art image rescaling network (known as IRN) tackles image downscaling and upscaling as mutually invertible tasks using invertible affine coupling layers. In particular, for upscaling, IRN models the missing high-frequency component by an input-independent (case-agnostic) Gaussian noise. In this work, we take one step further to predict a case-specific high-frequency component from textures embedded in the downscaled image. Moreover, we adopt integer coupling layers to avoid quantizing the downscaled image. When tested on commonly used datasets, the proposed method, termed DIRECT, improves high-resolution reconstruction quality both subjectively and objectively, while maintaining visually pleasing downscaled images.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133772093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Entropy-based Deep Product Quantization for Visual Search and Deep Feature Compression
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675383
Benben Niu, Ziwei Wei, Yun He
{"title":"Entropy-based Deep Product Quantization for Visual Search and Deep Feature Compression","authors":"Benben Niu, Ziwei Wei, Yun He","doi":"10.1109/VCIP53242.2021.9675383","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675383","url":null,"abstract":"With the emergence of various machine-to-machine and machine-to-human tasks with deep learning, the amount of deep feature data is increasing. Deep product quantization is widely applied in deep feature retrieval tasks and has achieved good accuracy. However, it does not focus on the compression target primarily, and its output is a fixed-length quantization index, which is not suitable for subsequent compression. In this paper, we propose an entropy-based deep product quantization algorithm for deep feature compression. Firstly, it introduces entropy into hard and soft quantization strategies, which can adapt to the codebook optimization and codeword determination operations in the training and testing processes respectively. Secondly, the loss functions related to entropy are designed to adjust the distribution of quantization index, so that it can accommodate to the subsequent entropy coding module. Experimental results carried on retrieval tasks show that the proposed method can be generally combined with deep product quantization and its extended schemes, and can achieve a better compression performance under near lossless condition.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131067395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Complex Event Recognition via Spatial-Temporal Relation Graph Reasoning
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675337
Hua Lin, Hongtian Zhao, Hua Yang
{"title":"Complex Event Recognition via Spatial-Temporal Relation Graph Reasoning","authors":"Hua Lin, Hongtian Zhao, Hua Yang","doi":"10.1109/VCIP53242.2021.9675337","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675337","url":null,"abstract":"Events in videos usually contain a variety of factors: objects, environments, actions, and their interaction relations, and these factors as the mid-level semantics can bridge the gap between the event categories and the video clips. In this paper, we present a novel video events recognition method that uses the graph convolution networks to represent and reason the logic relations among the inner factors. Considering that different kinds of events may focus on different factors, we especially use the transformer networks to extract the spatial-temporal features drawing upon the attention mechanism that can adaptively assign weights to concerned key factors. Although transformers generally rely more on large datasets, we show the effectiveness of applying a 2D convolution backbone before the transformers. We train and test our framework on the challenging video event recognition dataset UCF-Crime and conduct ablation studies. The experimental results show that our method achieves state-of-the-art performance, outperforming previous principal advanced models with a significant margin of recognition accuracy.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133499813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Real-time embedded hologram calculation for augmented reality glasses
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675435
Antonin Gilles
{"title":"Real-time embedded hologram calculation for augmented reality glasses","authors":"Antonin Gilles","doi":"10.1109/VCIP53242.2021.9675435","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675435","url":null,"abstract":"Thanks to its ability to provide accurate focus cues, Holography is considered as a promising display technology for augmented reality glasses. However, since it contains a large amount of data, the calculation of a hologram is a time-consuming process which results in prohibiting head-motion-to-photon latency, especially when using embedded calculation hardware. In this paper, we present a real-time hologram calculation method implemented on a NVIDIA Jetson AGX Xavier embedded platform. Our method is based on two modules: an offline pre-computation module and an on-the-fly hologram synthesis module. In the offline calculation module, the omnidirectional light field scattered by each scene object is individually pre-computed and stored in a Look-Up Table (LUT). Then, in the hologram synthesis module, the light waves corresponding to the viewer's position and orientation are extracted from the LUT in real-time to compute the hologram. Experimental results show that the proposed method is able to compute 2K1K color holograms at more than 50 frames per second, enabling its use in augmented reality applications.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131927286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Underwater Image Enhancement with Multi-Scale Residual Attention Network
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675342
Yosuke Ueki, M. Ikehara
{"title":"Underwater Image Enhancement with Multi-Scale Residual Attention Network","authors":"Yosuke Ueki, M. Ikehara","doi":"10.1109/VCIP53242.2021.9675342","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675342","url":null,"abstract":"Underwater images suffer from low contrast, color distortion and visibility degradation due to the light scattering and attenuation. Over the past few years, the importance of underwater image enhancement has increased because of ocean engineering and underwater robotics. Existing underwater image enhancement methods are based on various assumptions. However, it is almost impossible to define appropriate assumptions for underwater images due to the diversity of underwater images. Therefore, they are only effective for specific types of underwater images. Recently, underwater image enhancement algorisms using CNNs and GANS have been proposed, but they are not as advanced as other image processing methods due to the lack of suitable training data sets and the complexity of the issues. To solve the problems, we propose a novel underwater image enhancement method which combines the residual feature attention block and novel combination of multi-scale and multi-patch structure. Multi-patch network extracts local features to adjust to various underwater images which are often Non-homogeneous. In addition, our network includes multi-scale network which is often effective for image restoration. Experimental results show that our proposed method outperforms the conventional method for various types of images.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134552658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhanced Cross Component Sample Adaptive Offset for AVS3
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675321
Yunrui Jian, Jiaqi Zhang, Junru Li, Suhong Wang, Shanshe Wang, Siwei Ma, Wen Gao
{"title":"Enhanced Cross Component Sample Adaptive Offset for AVS3","authors":"Yunrui Jian, Jiaqi Zhang, Junru Li, Suhong Wang, Shanshe Wang, Siwei Ma, Wen Gao","doi":"10.1109/VCIP53242.2021.9675321","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675321","url":null,"abstract":"Cross-component prediction has great potential for removing the redundancy of multi-components. Recently, cross-component sample adaptive offset (CCSAO) was adopted in the third generation of Audio Video coding Standard (AVS3), which utilizes the intensities of co-located luma samples to determine the offsets of chroma sample filters. However, the frame-level based offset is rough for various content, and the edge information of classified samples is ignored. In this paper, we propose an enhanced CCSAO (ECCSAO) method to further improve the coding performance. Firstly, four selectable 1-D directional patterns are added to make the mapping between luma and chroma components more effectively. Secondly, one four-layer quad-tree based structure is designed to improve the filtering flexibility of CCSAO. Experimental results show that the proposed approach achieves 1.51%, 2.33% and 2.68% BD-rate savings for All-Intra (AI), Random-Access (RA) and Low Delay B (LD) configurations compared to AVS3 reference software, respectively. A subset improvement of ECCSAO has been adopted by AVS3.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"2 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126326869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Telemoji: A video chat with automated recognition of facial expressions
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675330
Alex Kreinis, Tom Damri, Tomer Leon, Marina Litvak, Irina Rabaev
{"title":"Telemoji: A video chat with automated recognition of facial expressions","authors":"Alex Kreinis, Tom Damri, Tomer Leon, Marina Litvak, Irina Rabaev","doi":"10.1109/VCIP53242.2021.9675330","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675330","url":null,"abstract":"Autism spectrum disorder (ASD) is frequently ac-companied by impairment in emotional expression recognition, and therefore individuals with ASD may find it hard to interpret emotions and interact. Inspired by this fact, we developed a web-based video chat to assist people with ASD, both for real-time recognition of facial emotions and for practicing. This real-time application detects the speaker's face in a video stream and classifies the expressed emotion into one of the seven categories: neutral, surprise, happy, angry, disgust, fear, and sad. The classification is then displayed as the text label below the speaker's face. We developed this application as a part of the undergraduate project for the B.Sc. degree in Software Engineering. Its development and testing were made with the cooperation of the local society for children and adults with autism. The application has been released for unrestricted use on https://telemojii.herokuapp.com/. The demo is available at http://www.filedropper.com/telemojishortdemoblur.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"45 Suppl 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126390728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pixel Gradient Based Zooming Method for Plenoptic Intra Prediction
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675380
Fan Jiang, Xin Jin, Kedeng Tong
{"title":"Pixel Gradient Based Zooming Method for Plenoptic Intra Prediction","authors":"Fan Jiang, Xin Jin, Kedeng Tong","doi":"10.1109/VCIP53242.2021.9675380","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675380","url":null,"abstract":"Plenoptic 2.0 videos that record time-varying light fields by focused plenoptic cameras are prospective to immersive visual applications due to capturing dense sampled light fields with high spatial resolution in the rendered sub-apertures. In this paper, an intra prediction method is proposed for compressing multi-focus plenoptic 2.0 videos efficiently. Based on the estimation of zooming factor, novel gradient-feature-based zooming, adaptive-bilinear-interpolation-based tailoring and inverse-gradient-based boundary filtering are proposed and executed sequentially to generate accurate prediction candidates for weighted prediction working with adaptive skipping strategy. Experimental results demonstrate the superior performance of the proposed method relative to HEVC and state-of-the-art methods.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"2022 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127601765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Reinforcement Learning based ROI Bit Allocation for Gaming Video Coding in VVC
Pub Date: 2021-12-05 | DOI: 10.1109/VCIP53242.2021.9675345
Guangjie Ren, Zizheng Liu, Zhenzhong Chen, Shan Liu
{"title":"Reinforcement Learning based ROI Bit Allocation for Gaming Video Coding in VVC","authors":"Guangjie Ren, Zizheng Liu, Zhenzhong Chen, Shan Liu","doi":"10.1109/VCIP53242.2021.9675345","DOIUrl":"https://doi.org/10.1109/VCIP53242.2021.9675345","url":null,"abstract":"In this paper, we propose a reinforcement learning based region of interest (ROI) bit allocation method for gaming video coding in Versatile Video Coding (VVC). Most current ROI-based bit allocation methods rely on bit budgets based on frame-level empirical weight allocation. The restricted bit budgets influence the efficiency of ROI-based bit allocation and the stability of video quality. To address this issue, the bit allocation process of frame and ROI are combined and formulated as a Markov decision process (MDP). A deep reinforcement learning (RL) method is adopted to solve this problem and obtain the appropriate bits of frame and ROI. Our target is to improve the quality of ROI and reduce the frame-level quality fluctuation, whilst satisfying the bit budgets constraint. The RL-based ROI bit allocation method is implemented in the latest video coding standard and verified for gaming video coding. The experimental results demonstrate that the proposed method achieves a better quality of ROI while reducing the quality fluctuation compared to the reference methods.","PeriodicalId":114062,"journal":{"name":"2021 International Conference on Visual Communications and Image Processing (VCIP)","volume":"15 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121005484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4