2020 IEEE International Conference on Visual Communications and Image Processing (VCIP): Latest Publications

No-Reference Stereoscopic Image Quality Assessment Based On Visual Attention Mechanism
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301770
Sumei Li, Ping Zhao, Yongli Chang
{"title":"No-Reference Stereoscopic Image Quality Assessment Based On Visual Attention Mechanism","authors":"Sumei Li, Ping Zhao, Yongli Chang","doi":"10.1109/VCIP49819.2020.9301770","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301770","url":null,"abstract":"In this paper, we proposed an optimized model based on the visual attention mechanism(VAM) for no-reference stereoscopic image quality assessment (SIQA). A CNN model is designed based on dual attention mechanism (DAM), which includes channel attention mechanism and spatial attention mechanism. The channel attention mechanism can give high weight to the features with large contribution to final quality, and small weight to features with low contribution. The spatial attention mechanism considers the inner region of a feature, and different areas are assigned different weights according to the importance of the region within the feature. In addition, data selection strategy is designed for CNN model. According to VAM, visual saliency is applied to guide data selection, and a certain proportion of saliency patches are employed to fine tune the network. The same operation is performed on the test set, which can remove data redundancy and improve algorithm performance. Experimental results on two public databases show that the proposed model is superior to the state-of-the-art SIQA methods. Cross-database validation shows high generalization ability and high effectiveness of our model.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114665135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Learning Redundant Sparsifying Transform based on Equi-Angular Frame
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301836
Min Zhang, Yunhui Shi, Xiaoyan Sun, N. Ling, Na Qi
{"title":"Learning Redundant Sparsifying Transform based on Equi-Angular Frame","authors":"Min Zhang, Yunhui Shi, Xiaoyan Sun, N. Ling, Na Qi","doi":"10.1109/VCIP49819.2020.9301836","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301836","url":null,"abstract":"Due to the fact that sparse coding in redundant sparse dictionary learning model is NP-hard, interest has turned to the non-redundant sparsifying transform as its sparse coding is computationally cheap. However, natural images typically contain diverse textures that cannot be sparsified well by a non-redundant system. In this paper we propose a new approach for learning redundant sparsifying transform based on equi-angular frame, where the frame and its dual frame are corresponding to applying the forward and the backward transforms. The uniform mutual coherence in the sparsifying transform is enforced by the equi-angular constraint, which better sparsifies diverse textures. In addition, an efficient algorithm is proposed for learning the redundant transform. Experimental results for image representation illustrate the superiority of our proposed method over non-redundant sparsifying transforms. The image denoising results show that our proposed method achieves superior denoising performance, in terms of subjective and objective quality, compared to the K-SVD, the data-driven tight frame method, the learning based sparsifying transform and the overcomplete transform model with block cosparsity (OCTOBOS).","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117183464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
No-Reference Objective Quality Assessment Method of Display Products
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301894
Huiqing Zhang, Donghao Li, Lifang Wu, Zhifang Xia
{"title":"No-Reference Objective Quality Assessment Method of Display Products","authors":"Huiqing Zhang, Donghao Li, Lifang Wu, Zhifang Xia","doi":"10.1109/VCIP49819.2020.9301894","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301894","url":null,"abstract":"Recent years have witnessed the spread of electronic devices especially the mobile phones, which have become almost the necessities in people’s daily lives. An effective and efficient technique for blindly assessing the quality of display products is greatly helpful to improve the experiences of users, such as displaying the pictures or texts in a more comfortable manner. In this paper, we put forward a novel no-reference (NR) quality metric of display products, dubbed as NQMDP. First, we have established a new subjective photo quality database, in which 50 photos shown on three different types of display products were captured to constitute a total of 150 photos and then scored by more than 40 inexperienced observers. Second, 19 effective image features are extracted by using six different influencing factors (including complexity, contrast, sharpness, brightness, colorfulness and naturalness) on the quality of display products and then were learned with the support vector regressor (SVR) to estimate the objective quality score of each photo. Results of experiments show that our proposed method has obtained better performance than the state-of-the-art algorithms.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129749677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Real-time Detection and Tracking Network with Feature Sharing
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301779
Ente Guo, Z. Chen, Zhenjia Fan, Xiujun Yang
{"title":"Real-time Detection and Tracking Network with Feature Sharing","authors":"Ente Guo, Z. Chen, Zhenjia Fan, Xiujun Yang","doi":"10.1109/VCIP49819.2020.9301779","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301779","url":null,"abstract":"Multiple object tracking (MOT) systems can benefit many applications, such as autonomous driving, action recognition, and surveillance. State-of-the-art methods detect objects in an image and then use a representation model to connect these objects with existing trajectories. However, the combination of these two components to reduce computation has received minimal attention. In this study, we propose a single-shot network for simultaneously detecting objects and extracting tracking features to achieve a real-time MOT system. We also present a detection–tracking coupled method that uses temporal information to improve the accuracy of object detection and make trajectories complete. Experimentation on the KITTI driving dataset indicates that our scheme achieves an accurate and fast MOT system. In particular, the lightweight network reaches a running speed of 100 FPS.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128440297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Sparse Spectral Unmixing of Hyperspectral Images using Expectation-Propagation
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301819
Zeng Li, Y. Altmann, Jie Chen, S. Mclaughlin, S. Rahardja
{"title":"Sparse Spectral Unmixing of Hyperspectral Images using Expectation-Propagation","authors":"Zeng Li, Y. Altmann, Jie Chen, S. Mclaughlin, S. Rahardja","doi":"10.1109/VCIP49819.2020.9301819","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301819","url":null,"abstract":"The aim of spectral unmixing of hyperspectral images is to determine the component materials and their associated abundances from mixed pixels. In this paper, we present sparse linear unmixing via an Expectation-Propagation method based on the classical linear mixing model and a spike-and-slab prior promoting abundance sparsity. The proposed method, which allows approximate uncertainty quantification (UQ), is compared to existing sparse unmixing methods, including Monte Carlo strategies traditionally considered for UQ. Experimental results on synthetic data and real hyperspectral data illustrate the benefits of the proposed algorithm over state-of-art linear unmixing methods.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130536249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Multi-Model Fusion Framework for NIR-to-RGB Translation
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301787
Longbin Yan, Xiuheng Wang, Min Zhao, Shumin Liu, Jie Chen
{"title":"A Multi-Model Fusion Framework for NIR-to-RGB Translation","authors":"Longbin Yan, Xiuheng Wang, Min Zhao, Shumin Liu, Jie Chen","doi":"10.1109/VCIP49819.2020.9301787","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301787","url":null,"abstract":"Near-infrared (NIR) images provide spectral information beyond the visible light spectrum and thus are useful in many applications. However, single-channel NIR images contain less information per pixel than RGB images and lack visibility for human perception. Transforming NIR images to RGB images is necessary for performing further analysis and computer vision tasks. In this work, we propose a novel NIR-to-RGB translation method. It contains two sub-networks and a fusion operator. Specifically, a U-net based neural network is used to learn the texture information while a CycleGAN based neural network is adopted to excavate the color information. Finally, a guided filter based fusion strategy is applied to fuse the outputs of these two neural networks. Experiment results show that our proposed method achieves superior NIR-to-RGB translation performance.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123959203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
FastSCCNet: Fast Mode Decision in VVC Screen Content Coding via Fully Convolutional Network
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301885
Sik-Ho Tsang, Ngai-Wing Kwong, Yui-Lam Chan
{"title":"FastSCCNet: Fast Mode Decision in VVC Screen Content Coding via Fully Convolutional Network","authors":"Sik-Ho Tsang, Ngai-Wing Kwong, Yui-Lam Chan","doi":"10.1109/VCIP49819.2020.9301885","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301885","url":null,"abstract":"Screen content coding have been supported recently in Versatile Video Coding (VVC) to improve the coding efficiency of screen content videos by adopting new coding modes which are dedicated to screen content video compression. Two new coding modes called Intra Block Copy (IBC) and Palette (PLT) are introduced. However, the flexible quad-tree plus multi-type tree (QTMT) coding structure for coding unit (CU) partitioning in VVC makes the fast algorithm of the SCC particularly challenging. To efficiently reduce the computational complexity of SCC in VVC, we propose a deep learning based fast prediction network, namely FastSCCNet, where a fully convolutional network (FCN) is designed. CUs are classified into natural content block (NCB) and screen content block (SCB). With the use of FCN, only one shot inference is needed to classify the block types of the current CU and all corresponding sub-CUs. After block classification, different subsets of coding modes are assigned according to the block type, to accelerate the encoding process. Compared with the conventional SCC in VVC, our proposed FastSCCNet reduced the encoding time by 29.88% on average, with negligible bitrate increase under all-intra configuration. To the best of our knowledge, it is the first approach to tackle the computational complexity reduction for SCC in VVC.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121505281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Point Cloud Geometry Prediction Across Spatial Scale using Deep Learning
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301804
Anique Akhtar, Wen Gao, Xianguo Zhang, Li Li, Zhu Li, Shan Liu
{"title":"Point Cloud Geometry Prediction Across Spatial Scale using Deep Learning","authors":"Anique Akhtar, Wen Gao, Xianguo Zhang, Li Li, Zhu Li, Shan Liu","doi":"10.1109/VCIP49819.2020.9301804","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301804","url":null,"abstract":"A point cloud is a 3D data representation that is becoming increasingly popular. Due to the large size of a point cloud, the transmission of point cloud is not feasible without compression. However, the current point cloud lossy compression and processing techniques suffer from quantization loss which results in a coarser sub-sampled representation of point cloud. In this paper, we solve the problem of points lost during voxelization by performing geometry prediction across spatial scale using deep learning architecture. We perform an octree-type upsampling of point cloud geometry where each voxel point is divided into 8 sub-voxel points and their occupancy is predicted by our network. This way we obtain a denser representation of the point cloud while minimizing the losses with respect to the ground truth. We utilize sparse tensors with sparse convolutions by using Minkowski Engine with a UNet like network equipped with inception-residual network blocks. Our results show that our geometry prediction scheme can significantly improve the PSNR of a point cloud, therefore, making it an essential post-processing scheme for the compression-transmission pipeline. This solution can serve as a crucial prediction tool across scale for point cloud compression, as well as display adaptation.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124501024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Sparse Representation-Based Intra Prediction for Lossless/Near Lossless Video Coding
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301752
Linwei Zhu, Yun Zhang, N. Li, Jinyong Pi, Xinju Wu
{"title":"Sparse Representation-Based Intra Prediction for Lossless/Near Lossless Video Coding","authors":"Linwei Zhu, Yun Zhang, N. Li, Jinyong Pi, Xinju Wu","doi":"10.1109/VCIP49819.2020.9301752","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301752","url":null,"abstract":"In this paper, a novel intra prediction method is presented for lossless/near lossless High Efficiency Video Coding (HEVC), termed as Sparse Representation based Intra Prediction (SRIP). In specific, the existing Angular Intra Prediction (AIP) modes in HEVC are organized as a mode dictionary, which is utilized to sparsely represent the visual signal by minimizing the difference with respect to the ground truth. For the match of encoding and decoding, the sparse coefficients are also required to be encoded and transmitted to the decoder side. To further improve the coding performance, an additional binary flag is included in the video codec to indicate which strategy is finally adopted with the rate distortion optimization, i.e., SRIP or traditional AIP. Extensive experimental results reveal that the proposed method can achieve 0.36% bit rate saving on average in case of lossless scenario.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115954369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Volumetric End-to-End Optimized Compression for Brain Images
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301767
Shuo Gao, Yueyi Zhang, Dong Liu, Zhiwei Xiong
{"title":"Volumetric End-to-End Optimized Compression for Brain Images","authors":"Shuo Gao, Yueyi Zhang, Dong Liu, Zhiwei Xiong","doi":"10.1109/VCIP49819.2020.9301767","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301767","url":null,"abstract":"The amount of volumetric brain image increases rapidly, which requires a vast amount of resources for storage and transmission, so it’s urgent to explore an efficient volumetric compression method. Recent years have witnessed the progress of deep learning-based approaches for two-dimensional (2D) natural image compression, but the field of learned volumetric image compression still remains unexplored. In this paper, we propose the first end-to-end learning framework for volumetric image compression by extending the advanced techniques of 2D image compression to volumetric images. Specifically, a convolutional autoencoder is used to compress 3D image cubes, and the non-local attention models are embedded in the convolutional autoencoder to jointly capture local and global correlations. Both hyperprior and autoregressive models are used to perform the conditional probability estimation in entropy coding. To reduce model complexity, we introduce a convolutional long short-term memory network for the autoregressive model based on channel-wise prediction. Experimental results on volumetric mouse brain images show that the proposed method outperforms JPEG2000-3D, HEVC and state-of-the-art 2D methods.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132209557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3