2019 IEEE Visual Communications and Image Processing (VCIP): Latest Publications

Bit Allocation based on Visual Saliency in HEVC
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965753
ChungWen Ku, Guoqing Xiang, Feng Qi, Wei Yan, Yuan Li, Xiaodong Xie
{"title":"Bit Allocation based on Visual Saliency in HEVC","authors":"ChungWen Ku, Guoqing Xiang, Feng Qi, Wei Yan, Yuan Li, Xiaodong Xie","doi":"10.1109/VCIP47243.2019.8965753","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965753","url":null,"abstract":"As one of the important part in the HEVC reference software, R-lambda model adopts mean absolute difference (MAD) for the coding unit tree (CTU) level bit allocation. However, this optimum method may neglect some important characteristics of human visual system (HVS). In this paper, we propose a novel bit allocation algorithm to process some salient visual information priority. Firstly, an improved video saliency detection algorithm is proposed, which induces temporal correlation into a 2D visual attention model. Secondly, the visual saliency based CTU level bit allocation algorithm is presented by allocating bits for CTUs with their saliency weights. What’s more, with considerations of the temporal quality consistence among Saliency Areas (SAs), a window based weight smoothing model is proposed to achieve better subjective quality. Finally, several experiments are performed on the HEVC reference software, HM16.9, under the low delay P configuration, and the experimental results show that the average BD-Rate of the entire test sequences and of the SAs reduce 1.7% and 6.2%, respectively. The proposed algorithm can also improve subjective quality remarkably.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124693282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
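As a rough illustration of the CTU-level step described in the abstract above, the following is a minimal sketch (not the authors' implementation) of allocating a frame's bit budget to CTUs in proportion to temporally smoothed saliency weights; the proportional rule, the window length, and all variable names are assumptions for illustration.

```python
# Minimal sketch of saliency-weighted CTU-level bit allocation.
# Not the paper's code: the proportional rule and window size are assumed.
import numpy as np

def smooth_weights(weight_history, window=4):
    """Average the most recent per-CTU weight maps to stabilize quality
    across frames (window-based smoothing; parameters assumed)."""
    recent = weight_history[-window:]
    return np.mean(recent, axis=0)

def allocate_ctu_bits(frame_target_bits, saliency_weights):
    """Distribute the frame-level bit budget across CTUs in proportion
    to their (smoothed) saliency weights."""
    w = np.asarray(saliency_weights, dtype=np.float64)
    w = np.clip(w, 1e-6, None)  # avoid assigning zero bits to any CTU
    return frame_target_bits * w / w.sum()

if __name__ == "__main__":
    history = [np.random.rand(64) for _ in range(5)]  # 64 CTUs, toy weights
    weights = smooth_weights(history, window=4)
    bits = allocate_ctu_bits(frame_target_bits=200_000, saliency_weights=weights)
    print(bits.sum(), bits.min(), bits.max())
```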
A QoE-oriented Saliency-aware Approach for 360-degree Video Transmission
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965847
Wang Shen, Lianghui Ding, Guangtao Zhai, Ying Cui, Zhiyong Gao
{"title":"A QoE-oriented Saliency-aware Approach for 360-degree Video Transmission","authors":"Wang Shen, Lianghui Ding, Guangtao Zhai, Ying Cui, Zhiyong Gao","doi":"10.1109/VCIP47243.2019.8965847","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965847","url":null,"abstract":"The tradeoff between bandwidth efficiency and quality of experience (QoE) is a key issue in 360 video transmission. In this paper, we propose a QoE-oriented saliency-aware 360 video transmission framework to balance this tradeoff. The target is to reduce the bandwidth demand without declining the QoE. Specifically, the proposed model is based on the decision-making process. We use Lyapunov optimization to solve the decisionmaking problem. Furthermore, we integrate saliency information into the model to influence the decision policy, so that the model has the advantage of bandwidth efficiency. The simulation results show that the tradeoff parameter of Lyapunov optimization can balance the tradeoff between QoE and bandwidth efficiency, and 360 video saliency entropy limits the upper and lower bounds of QoE and bandwidth efficiency.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125189925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
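The decision-making step above can be pictured with a generic Lyapunov drift-plus-penalty loop. The sketch below is only illustrative: the virtual queue definition, the candidate actions, the QoE target, and the tradeoff parameter V are assumptions, not the paper's formulation.

```python
# Generic drift-plus-penalty decision loop (illustrative assumption,
# not the paper's model): trade bandwidth cost against a QoE deficit queue.
def choose_quality(queue, candidates, V):
    """Pick the candidate minimizing V*bandwidth - queue*qoe."""
    return min(candidates, key=lambda c: V * c["bandwidth"] - queue * c["qoe"])

def simulate(candidates, V=1.0, steps=10, qoe_target=0.8):
    queue = 0.0  # virtual queue tracking accumulated QoE deficit (assumed)
    for _ in range(steps):
        action = choose_quality(queue, candidates, V)
        # The queue grows whenever delivered QoE falls short of the target.
        queue = max(queue + qoe_target - action["qoe"], 0.0)
        yield action, queue

if __name__ == "__main__":
    tiles = [{"qoe": 0.5, "bandwidth": 1.0},
             {"qoe": 0.8, "bandwidth": 2.5},
             {"qoe": 0.95, "bandwidth": 5.0}]
    for action, q in simulate(tiles, V=2.0):
        print(action, round(q, 2))
```

A larger V favors bandwidth savings; a growing queue pushes the policy back toward higher-QoE actions.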
Reference Picture Synthesis for Video Sequences Captured with a Monocular Moving Camera
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965883
H. Golestani, Christian Rohlfing, J. Ohm
{"title":"Reference Picture Synthesis for Video Sequences Captured with a Monocular Moving Camera","authors":"H. Golestani, Christian Rohlfing, J. Ohm","doi":"10.1109/VCIP47243.2019.8965883","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965883","url":null,"abstract":"Inter-frame prediction plays an important role in video coding by predicting the current frame from previously encoded pictures, called reference pictures. In the case of camera motion, the content of a current frame could be very different from its reference pictures and may consequently lead to a more difficult Motion Compensation (MC). The main idea of this paper is to process the input 2D video sequence in order to estimate the 3D geometry of the scene and then employ this data to virtually synthesize \"geometrically compensated\" reference pictures. Since these virtual reference pictures are more similar to the current frame, motion estimation and consequently coding efficiency could be enhanced. The proposed method is tested over six different video sequences and around 11% bitrate reduction is achieved compared to the High Efficiency Video Coding (HEVC) standard.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126209659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
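To make the idea of a "geometrically compensated" reference concrete, here is a simplified 2D stand-in that warps a decoded reference toward the current frame with a single homography; the paper instead estimates full 3D scene geometry, so this OpenCV sketch is an approximation for illustration only.

```python
# Simplified stand-in: warp a reference picture toward the current frame
# with one homography (the paper uses full 3D geometry; this is assumed).
import cv2
import numpy as np

def synthesize_reference(ref_img, cur_img, max_features=2000):
    orb = cv2.ORB_create(max_features)
    k1, d1 = orb.detectAndCompute(ref_img, None)
    k2, d2 = orb.detectAndCompute(cur_img, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:500]
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # H maps reference coordinates to current-frame coordinates.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    h, w = cur_img.shape[:2]
    # Warp the reference toward the current camera viewpoint.
    return cv2.warpPerspective(ref_img, H, (w, h))
```

The warped picture would then be inserted into the reference list so that motion search starts from content already aligned with the current view.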
PPML: Metric Learning with Prior Probability for Video Object Segmentation
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965961
Hangshi Zhong, Zhentao Tan, Bin Liu, Weihai Li, Nenghai Yu
{"title":"PPML: Metric Learning with Prior Probability for Video Object Segmentation","authors":"Hangshi Zhong, Zhentao Tan, Bin Liu, Weihai Li, Nenghai Yu","doi":"10.1109/VCIP47243.2019.8965961","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965961","url":null,"abstract":"Video object segmentation plays an important role in computer vision and has attracted much attention. Although many recent works have removed the fine-tuning process in pursuit of fast inference speed, while achieving high segmentation accuracy, they are still far from being real-time. In this paper, we regard this task as a feature matching problem and propose a prior probability based metric learning (PPML) method for faster inference speed and higher segmentation accuracy. The proposed method consists of two ingredients: a novel template space updating strategy that improves the efficiency of segmentation by avoiding the explosion of data in template space, and a novel feature matching method which applies more potential probability information through integrating the prior of the first frame and the predicted score of previous frames. Experimental results on DAVIS datasets demonstrate that the proposed method reaches the state-of-the-art competitive performance and is more efficient in time consumption.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117025424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
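A minimal sketch of prior-weighted feature matching for mask propagation, in the spirit of the abstract above: per-pixel embeddings are cosine-matched against template features and the scores are fused with a prior probability map. The fusion rule and template layout are assumptions, not the paper's PPML formulation.

```python
# Sketch of prior-weighted feature matching for segmentation.
# The fusion rule and template handling are illustrative assumptions.
import numpy as np

def match_score(features, templates):
    """features: (H*W, C) pixel embeddings; templates: (N, C) template
    embeddings. Returns the best cosine similarity per pixel."""
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    t = templates / (np.linalg.norm(templates, axis=1, keepdims=True) + 1e-8)
    return (f @ t.T).max(axis=1)

def segment(features, fg_templates, bg_templates, prior_fg):
    """Fuse foreground/background matching scores with a prior map."""
    fg = match_score(features, fg_templates) * prior_fg
    bg = match_score(features, bg_templates) * (1.0 - prior_fg)
    return (fg > bg).astype(np.uint8)  # per-pixel foreground mask

if __name__ == "__main__":
    H, W, C = 32, 32, 16
    feats = np.random.rand(H * W, C)
    fg_t = np.random.rand(8, C)   # foreground template features (toy)
    bg_t = np.random.rand(8, C)   # background template features (toy)
    prior = np.full(H * W, 0.5)   # flat prior for the toy example
    mask = segment(feats, fg_t, bg_t, prior)
    print(mask.reshape(H, W).sum(), "foreground pixels")
```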
Progressive Semantic Image Synthesis via Generative Adversarial Network
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8966069
Ke Yue, Yidong Li, Huifang Li
{"title":"Progressive Semantic Image Synthesis via Generative Adversarial Network","authors":"Ke Yue, Yidong Li, Huifang Li","doi":"10.1109/VCIP47243.2019.8966069","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8966069","url":null,"abstract":"Semantic image synthesis via text description is a desirable and challenging task, which requires more protection of the text irrelevant content in the original image. Existing methods directly modify the original image, which become more difficult when encountering high resolution image, and the generated images are also blurred and lack in detail. This paper presents a novel network architecture to progressively manipulate an image starting from low-resolution, while introducing the original image of corresponding size at different stages with our proposed union module to avoid losing of detail. And the progressive design of the network allows us to modify the image from coarse into fine. Compared with the previous methods, our new method can successfully manipulate a high resolution image and generate a new image with background protection and fine details. The experimental results on CUB-200-2011 dataset show that the proposed approach outperforms existing methods in terms of image detail, background protection and high resolution generation.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130819784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Image-Based End-to-End Neural Network for Dense Disparity Estimation
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965761
Shuqiao Sun, Rongke Liu, Qiuchen Du, Shantong Sun, Shaoli Kang
{"title":"Image-Based End-to-End Neural Network for Dense Disparity Estimation","authors":"Shuqiao Sun, Rongke Liu, Qiuchen Du, Shantong Sun, Shaoli Kang","doi":"10.1109/VCIP47243.2019.8965761","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965761","url":null,"abstract":"Stereo matching is a challenging yet important task to various computer vision applications, e.g. 3D reconstruction, augmented reality, and autonomous vehicles. In this paper, we present a novel image-based convolutional neural network (CNN) for dense disparity estimation using stereo image pairs. In order to achieve precise and robust stereo matching, we introduce a feature extraction module that learns both local and global information. These features are then passed through an hour-glass structure to generate disparity maps from lower resolution to full resolution. We test the proposed method in several datasets including indoor scenes and synthetic scenes. Experimental results demonstrate that the proposed method outperforms the state-of-the-art methods in several datasets.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129750320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
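The hour-glass structure can be pictured as a small encoder-decoder that emits disparity at half and full resolution. The PyTorch block below is illustrative only; channel counts, layer depth, and the skip connections are assumptions rather than the paper's architecture.

```python
# Illustrative hour-glass refinement block emitting half- and full-resolution
# disparity; dimensions are assumed, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HourGlass(nn.Module):
    def __init__(self, c=32):
        super().__init__()
        self.down1 = nn.Conv2d(c, 2 * c, 3, stride=2, padding=1)
        self.down2 = nn.Conv2d(2 * c, 4 * c, 3, stride=2, padding=1)
        self.up1 = nn.ConvTranspose2d(4 * c, 2 * c, 4, stride=2, padding=1)
        self.up2 = nn.ConvTranspose2d(2 * c, c, 4, stride=2, padding=1)
        self.head_low = nn.Conv2d(2 * c, 1, 3, padding=1)   # half-res disparity
        self.head_full = nn.Conv2d(c, 1, 3, padding=1)       # full-res disparity

    def forward(self, x):
        d1 = F.relu(self.down1(x))
        d2 = F.relu(self.down2(d1))
        u1 = F.relu(self.up1(d2)) + d1   # skip connection from encoder
        u2 = F.relu(self.up2(u1)) + x
        return self.head_low(u1), self.head_full(u2)

if __name__ == "__main__":
    feats = torch.randn(1, 32, 64, 128)   # fused left/right features (toy)
    disp_half, disp_full = HourGlass()(feats)
    print(disp_half.shape, disp_full.shape)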
PGAN: Prediction Generative Adversarial Nets for Meshes
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965985
Tingting Li, Yunhui Shi, Xiaoyan Sun, Jin Wang, Baocai Yin
{"title":"PGAN: Prediction Generative Adversarial Nets for Meshes","authors":"Tingting Li, Yunhui Shi, Xiaoyan Sun, Jin Wang, Baocai Yin","doi":"10.1109/VCIP47243.2019.8965985","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965985","url":null,"abstract":"Unlike images, the topology similarity among meshes can hardly be handled with traditional signal processing tools because of their irregular structures. Geometry image parameterization provides a way to represent 3D meshes in the form of 2D geometry and normal images. However, most existing methods, including the CoGAN are not suitable for such unnatural images corresponding to meshes. To solve this problem, we propose a Prediction Generative Adversarial Network (PGAN) to learn a joint distribution of geometry and normal images for generating meshes. Particularly, we enforce a prediction constraint on the geometry GAN and normal GAN in our PGAN utilizing the inherent relationship between the geometry and normal. The experimental results on face mesh generation indicate that our PGAN outperforms in generating realistic face models with rich facial attributes such as facial expression and retaining the geometry of the faces.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114565472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Predicting the visual saliency of the people with VIMS
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965925
Jiawei Yang, Guangtao Zhai, Huiyu Duan
{"title":"Predicting the visual saliency of the people with VIMS","authors":"Jiawei Yang, Guangtao Zhai, Huiyu Duan","doi":"10.1109/VCIP47243.2019.8965925","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965925","url":null,"abstract":"As is known to us, visually induced motion sickness (VIMS) is often experienced in a virtual environment. Learning the visual attention of people with VIMS contributes to related research in the field of virtual reality (VR) content design and psychology. In this paper, we first construct a saliency prediction for people with VIMS (SPPV) database, which is the first of its kind. The database consists of 80 omnidirectional images and the corresponding eye tracking data collected from 30 individuals. We analyze the performance of five state-of-the-art deep neural networks (DNN)-based saliency prediction algorithms with their original networks and the fine-tuned networks on our database. We predict the atypical visual attention of people with VIMS for the first time and obtain relatively good saliency prediction results for VIMS controls so far.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117236745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Parallax-Tolerant 360 Live Video Stitcher
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965900
Miko Atokari, Marko Viitanen, Alexandre Mercat, Emil Kattainen, Jarno Vanne
{"title":"Parallax-Tolerant 360 Live Video Stitcher","authors":"Miko Atokari, Marko Viitanen, Alexandre Mercat, Emil Kattainen, Jarno Vanne","doi":"10.1109/VCIP47243.2019.8965900","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965900","url":null,"abstract":"This paper presents an open-source software implementation for real-time 360-degree video stitching. To ensure a seamless stitching result, cylindrical and content-preserving warping are implemented to dynamically correct image alignment and parallax, which may drift due to scene changes, moving objects, or camera movement. Depth variation, color changes, and lighting differences between adjacent frames are also smoothed out to improve visual quality of the panoramic video. The system is benchmarked with six 1080p videos, which are stitched into 4096×732 pixel output format. The proposed algorithm attains an output rate of 18 frames per second on GeForce GTX 1070 GPU and real-time speed can be met with a high-end GPU.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123007844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
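One ingredient named in the abstract, cylindrical warping, is standard enough to sketch. The snippet below projects an image onto a cylinder via inverse mapping with OpenCV; the focal length and the remap-based implementation are assumptions for illustration, not the stitcher's actual code.

```python
# Standard cylindrical warp via inverse mapping; focal length is assumed.
import cv2
import numpy as np

def cylindrical_warp(img, focal):
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    theta = (xs - cx) / focal          # angle around the cylinder
    height = (ys - cy) / focal         # height on the cylinder
    # Back-project cylinder coordinates to the source image plane.
    map_x = (focal * np.tan(theta) + cx).astype(np.float32)
    map_y = (focal * height / np.cos(theta) + cy).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)
```

Warping each camera's frame onto a common cylinder reduces the residual alignment that the content-preserving warp and seam smoothing then have to correct.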
CR-U-Net: Cascaded U-Net with Residual Mapping for Liver Segmentation in CT Images*
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8966072
Yiwei Liu, Na Qi, Qing Zhu, Weiran Li
{"title":"CR-U-Net: Cascaded U-Net with Residual Mapping for Liver Segmentation in CT Images*","authors":"Yiwei Liu, Na Qi, Qing Zhu, Weiran Li","doi":"10.1109/VCIP47243.2019.8966072","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8966072","url":null,"abstract":"Abdominal computed tomography (CT) is a common modality to detect liver lesions. Liver segmentation in CT scan is important for diagnosis and analysis of liver lesions. However, the accuracy of existing liver segmentation methods is slightly insufficient. In this paper, we propose a liver segmentation architecture named CR-U-Net, which is composed of cascade U-Net combined with residual mapping. We make use of the MDice loss function for training in CR-U-Net, and the second-level of cascade network is deeper than the first-level to extract more detailed image features. Morphological algorithms are utilized as an intermediate-processing step to improve the segmentation accuracy. In addition, we evaluate our proposed CR-U-Net on liver segmentation task under the dataset provided by the 2017 ISBI LiTS Challenge. The experimental result demonstrates that our proposed CR-U-Net can outperform the state-of-the-art methods in term of the performance measures, such as Dice score, VOE, and so on.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129605381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
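Since the exact MDice formulation is not given in the abstract, the following is a standard soft Dice loss in PyTorch as a stand-in sketch; treat the smoothing constant and tensor layout as assumptions.

```python
# Standard soft Dice loss as an assumed stand-in for the paper's MDice loss.
import torch

def soft_dice_loss(pred, target, eps=1.0):
    """pred: (N, 1, H, W) probabilities; target: (N, 1, H, W) binary mask."""
    pred = pred.flatten(1)
    target = target.flatten(1)
    inter = (pred * target).sum(dim=1)
    union = pred.sum(dim=1) + target.sum(dim=1)
    dice = (2 * inter + eps) / (union + eps)
    return 1.0 - dice.mean()

if __name__ == "__main__":
    p = torch.rand(2, 1, 64, 64)
    t = (torch.rand(2, 1, 64, 64) > 0.5).float()
    print(soft_dice_loss(p, t).item())
```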