2022 IEEE International Conference on Visual Communications and Image Processing (VCIP): Latest Publications

Texture-aware Network for Smoke Density Estimation
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008826
Xue Xia, K. Zhan, Yajing Peng, Yuming Fang
{"title":"Texture-aware Network for Smoke Density Estimation","authors":"Xue Xia, K. Zhan, Yajing Peng, Yuming Fang","doi":"10.1109/VCIP56404.2022.10008826","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008826","url":null,"abstract":"Smoke density estimation, also termed as soft segmentation, was developed from pixel-wise smoke (hard) segmen-tation and it aims at providing transparency and segmentation confidence for each pixel. The key difference between them lies in that segmentation focuses on classifying pixels into smoke and non-smoke ones, while density estimation obtains inner transparency of smoke component rather than treat all smoke pixels as an equal value. Based on this, we propose a texture-aware network being able to capture inner transparency of smoke components rather than merely focus on general smoke distribution for pixel-wise smoke density estimation. Besides, we adapt the Squeeze-and-Excitation (SE) layer for smoke feature extraction by involving max values for robustness. In order to represent inhomogeneous smoke pixels, we proposed a simple yet efficient attention-based texture-aware module that involves both gradient and semantic information. Experimental results show that our method outperforms others in both single image density estimation or segmentation and video smoke detection.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121663119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RGBD-based Real-time Volumetric Reconstruction System: Architecture Design and Implementation
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008839
Kai Zhou, Shuai Guo, Jin-Song Hu, Jiong-Qi Wang, Qiuwen Wang, Li Song
{"title":"RGBD-based Real-time Volumetric Reconstruction System: Architecture Design and Implementation","authors":"Kai Zhou, Shuai Guo, Jin-Song Hu, Jiong-Qi Wang, Qiuwen Wang, Li Song","doi":"10.1109/VCIP56404.2022.10008839","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008839","url":null,"abstract":"With the increasing popularity of commercial depth cameras, 3D reconstruction of dynamic scenes has aroused widespread interest. Although many novel 3D applications have been unlocked, real-time performance is still a big problem. In this paper, a low-cost, real-time system: LiveRecon3D, is presented, with multiple RGB-D cameras connected to one single computer. The goal of the system is to provide an interactive frame rate for 3D content capture and rendering at a reduced cost. In the proposed system, we adopt a scalable volume structure and employ ray casting technique to extract the surface of 3D content. Based on a pipeline design, all the modules in the system run in parallel and are designed to minimize the latency to achieve an interactive frame rate of 30 FPS. At last, experimental results corresponding to implementation with three Kinect v2 cameras are presented to verify the system's effectiveness in terms of visual quality and real-time performance.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125343245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
History-parameter-based Affine Model Inheritance
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008881
Kai Zhang, Li Zhang, Z. Deng, Na Zhang, Yang Wang
{"title":"History-parameter-based Affine Model Inheritance","authors":"Kai Zhang, Li Zhang, Z. Deng, Na Zhang, Yang Wang","doi":"10.1109/VCIP56404.2022.10008881","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008881","url":null,"abstract":"In VVC, affine motion compensation (AMC) is a powerful coding tool to address non-translational motion, while history-based motion vector prediction (HMVP) is an efficient approach to compress motion vectors. However, HMVP was designed for translational motion vectors, without considering control point motion vectors (CPMV) for AMC. This paper presents a method of history-parameter-based affine model inheritance (HAMI), to utilize history information to represent CPMV more efficiently. With HAMI, affine parameters of previously affine-coded block are stored in a first history-parameter table (HPT). New affine-merge, affine motion vector prediction candidates and regular-merge candidates can be constructed with affine parameters fetched from the first HPT and base MVs fetched from neighbouring blocks in a “base-parameter-decoupled” way. New affine merge candidates can also be generated in a “base-parameter-coupled” way from a second HPT, which stores base MV information together with corresponding affine parameters. Besides, pair-wised affine merge candidates are generated by two existing affine merge candidates. Experimental results show that HAMI provides an average BD-rate saving about 0.34 % with a negligible change on the running time, compared with ECM-3.1 in random access configurations. HAMI has been adopted into ECM.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116696660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cross-Layer Feature based Multi-Granularity Visual Classification
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008879
Junhan Chen, Dongliang Chang, Jiyang Xie, Ruoyi Du, Zhanyu Ma
{"title":"Cross-Layer Feature based Multi-Granularity Visual Classification","authors":"Junhan Chen, Dongliang Chang, Jiyang Xie, Ruoyi Du, Zhanyu Ma","doi":"10.1109/VCIP56404.2022.10008879","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008879","url":null,"abstract":"In contrast to traditional fine-grained visual clas-sification, multi-granularity visual classification is no longer limited to identifying the different sub-classes belonging to the same super-class (e.g., bird species, cars, and aircraft models). Instead, it gives a sequence of labels from coarse to fine (e.g., Passeriformes → Corvidae → Fish Crow), which is more convenient in practice. The key to solving this task is how to use the relationships between the different levels of labels to learn feature representations that contain different levels of granularity. Interestingly, the feature pyramid structure naturally implies different granularity of feature representation, with the shallow layers representing coarse-grained features and the deep layers representing fine-grained features. Therefore, in this paper, we exploit this property of the feature pyramid structure to decouple features and obtain feature representations corre-sponding to different granularities. Specifically, we use shallow features for coarse-grained classification and deep features for fine-grained classification. In addition, to enable fine-grained features to enhance the coarse-grained classification, we propose a feature reinforcement module based on the feature pyramid structure, where deep features are first upsampled and then combined with shallow features to make decisions. Experimental results on three widely used fine-grained image classification datasets such as CUB-200-2011, Stanford Cars, and FGVC-Aircraft validate the method's effectiveness. Code available at https://github.com/PRIS-CV/CGVC.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116813530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Semantic Compensation Based Dual-Stream Feature Interaction Network for Multi-oriented Scene Text Detection
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008831
Siyan Wang, Sumei Li
{"title":"Semantic Compensation Based Dual-Stream Feature Interaction Network for Multi-oriented Scene Text Detection","authors":"Siyan Wang, Sumei Li","doi":"10.1109/VCIP56404.2022.10008831","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008831","url":null,"abstract":"Due to the various appearances of scene text instances and the disturbance of background, it is still a challenging task to design an effective and accurate text detector. To tackle this problem, in this paper we propose a novel dual-stream scene text detector considering semantic compensation and feature interaction. The detector extracts image features from two input images of different resolution, which improves its perceptive ability and contributes to detecting large and long texts. Specifically, we propose a Semantic Compensation Module (SCM) to aggregate features between the two streams, which compensates semantic information in features at each level via an attention mechanism. Moreover, we design a Feature Interaction Module (FIM) to obtain more expressive features. Experiments conducted on three benchmark datasets, ICDAR2015, MSRA-TD500 and ICDAR2017-MLT, demonstrate that our proposed method has competitive performance and strong robustness.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116118307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Autoencoder-based intra prediction with auxiliary feature
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008846
Luhang Xu, Yue Yu, Haoping Yu, Dong Wang
{"title":"Autoencoder-based intra prediction with auxiliary feature","authors":"Luhang Xu, Yue Yu, Haoping Yu, Dong Wang","doi":"10.1109/VCIP56404.2022.10008846","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008846","url":null,"abstract":"A set of auto encoders is trained to perform intra prediction for block-based video coding. Each auto encoder consists of an encoding network and a decoding network. Both encoding network and decoding networks are jointly optimized and integrated into the state-of-the-art VVC reference software VTM-11.0 as an additional intra prediction mode. The simulation is conducted under common test conditions with all intra config-urations and the test results show 1.55%, 1.04%, and 0.99% of Y, U, V components Bjentegaard-Delta bit rate saving compared to VTM-11.0 anchor, respectively. The overall relative decoding running time of proposed autoencoder-based intra prediction mode on top of VTM-11.0 are 408% compared to VTM-11.0.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116722095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Recurrent Network with Enhanced Alignment and Attention-Guided Aggregation for Compressed Video Quality Enhancement
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008807
Xiaodi Shi, Jucai Lin, Dong-Jin Jiang, Chunmei Nian, Jun Yin
{"title":"Recurrent Network with Enhanced Alignment and Attention-Guided Aggregation for Compressed Video Quality Enhancement","authors":"Xiaodi Shi, Jucai Lin, Dong-Jin Jiang, Chunmei Nian, Jun Yin","doi":"10.1109/VCIP56404.2022.10008807","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008807","url":null,"abstract":"Recently, various compressed video quality enhancement technologies have been proposed to overcome the visual artifacts. Most existing methods are based on optical flow or deformable alignment to explore the spatiotemporal information across frames. However, inaccurate motion estimation and training instability of deformable convolution would be detrimental to the reconstruction performance. In this paper, we design a bi-directional recurrent network equipping with enhanced deformable alignment and attention-guided aggregation to promote information flows among frames. For the alignment, a pair of scale and shift parameters are learned to modulate optical flows into new offsets for deformable convolution. Furthermore, an attention aggregation strategy oriented at preference is designed for temporal information fusion. The strategy synthesizes global information of inputs to modulate features for effective fusion. Extensive experiments have proved that the proposed method achieves great performance in terms of quantitative performance and qualitative effect.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117134591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Recurrent Multi-connection Fusion Network for Single Image Deraining
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008893
Yuetong Liu, Rui Zhang, Yunfeng Zhang, Yang Ning, Xunxiang Yao, Huijian Han
{"title":"Recurrent Multi-connection Fusion Network for Single Image Deraining","authors":"Yuetong Liu, Rui Zhang, Yunfeng Zhang, Yang Ning, Xunxiang Yao, Huijian Han","doi":"10.1109/VCIP56404.2022.10008893","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008893","url":null,"abstract":"Single image deraining is an important problem in many computer vision tasks because rain streaks can severely degrade the image quality. Recently, deep convolution neural network (CNN) based single image deraining methods have been developed with encouraging performance. However, most of these algorithms are designed by stacking convolutional layers, which encounter obstacles in learning abstract feature representation effectively and can only obtain limited features in the local region. In this paper, we propose a recurrent multi-connection fusion network (RMCFN) to remove rain streaks from single images. Specifically, the RMCFN employs two key components and multiple connections to fully utilize and transfer features. Firstly, we use a multi-scale fusion memory block (MFMB) to exploit multi-scale features and obtain long-range dependencies, which is beneficial to feed useful information to a later stage. Moreover, to efficiently capture the informative features on the transmission, we fuse the features of different levels and employ a multi-connection manner to use the information within and between stages. Finally, we develop a dual attention enhancement block (DAEB) to explore the valuable channel and spatial components and only pass further useful features. Extensive experiments verify the superiority of our method in visual effect and quantitative results compared to the state-of-the-arts.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127099942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Weakly Supervised Region-Level Contrastive Learning for Efficient Object Detection
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008827
Yuang Deng, Yuhang Zhang, Wenrui Dai, Xiaopeng Zhang, H. Xiong
{"title":"Weakly Supervised Region-Level Contrastive Learning for Efficient Object Detection","authors":"Yuang Deng, Yuhang Zhang, Wenrui Dai, Xiaopeng Zhang, H. Xiong","doi":"10.1109/VCIP56404.2022.10008827","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008827","url":null,"abstract":"Semi-supervised learning, which assigns pseudo labels with models trained using limited labeled data, has been widely used in object detection to reduce the labeling cost. However, the provided pseudo annotations inevitably suffer noise since the initial model is not perfect. To address this issue, this paper introduces contrastive learning into semi-supervised object detection, and we claim that contrastive loss, which inherently relies on data augmentations, is much more robust than traditional softmax regression for noisy labels. To take full advantage of it in the detection task, we incorporate labels prior to contrastive loss and leverage plenty of region proposals to enhance diversity, which is crucial for contrastive learning. In this way, the model is optimized to make the region-level features with the same class be translation and scale invariant. Furthermore, we redesign the negative memory bank in contrastive learning to make the training more efficient. As far as we know, we are the first attempt that introduces contrastive learning in semi-supervised object detection. Experimental results on detection benchmarks demonstrate the superiority of our method. Notably, our method achieves 79.9% accuracy on VOC, which is 6.2% better than the supervised baseline and 0.7% improvement compared with the state-of-the-art method.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127105400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Reduced Reference Quality Assessment for Point Cloud Compression
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2022-12-13 DOI: 10.1109/VCIP56404.2022.10008813
Yipeng Liu, Qi Yang, Yi Xu
{"title":"Reduced Reference Quality Assessment for Point Cloud Compression","authors":"Yipeng Liu, Qi Yang, Yi Xu","doi":"10.1109/VCIP56404.2022.10008813","DOIUrl":"https://doi.org/10.1109/VCIP56404.2022.10008813","url":null,"abstract":"In this paper, we propose a reduced reference (RR) point cloud quality assessment (PCQA) model named R-PCQA to quantify the distortions introduced by the lossy compression. Specifically, we use the attribute and geometry quantization steps of different compression methods (i.e., V-PCC, G-PCC and AVS) to infer the point cloud quality, assuming that the point clouds have no other distortions before compression. First, we analyze the compression distortion of point clouds under separate attribute compression and geometry compression to avoid their mutual masking, for which we consider 5 point clouds as references to generate a compression dataset (PCCQA) containing independent attribute compression and geometry compression samples. Then, we develop the proposed R-PCQA via fitting the relationship between the quantization steps and the perceptual quality. We evaluate the performance of R-PCQA on both the established dataset and another independent dataset. The results demonstrate that the proposed R-PCQA can exhibit reliable performance and high generalization ability.","PeriodicalId":269379,"journal":{"name":"2022 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127144779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1