2020 IEEE International Conference on Visual Communications and Image Processing (VCIP): Latest Publications

Icon Colorization Based On Triple Conditional Generative Adversarial Networks
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301890
Qin-Ru Han, Wenzhe Zhu, Qing Zhu
{"title":"Icon Colorization Based On Triple Conditional Generative Adversarial Networks","authors":"Qin-Ru Han, Wenzhe Zhu, Qing Zhu","doi":"10.1109/VCIP49819.2020.9301890","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301890","url":null,"abstract":"Current automatic colorization systems have many defects such as \"contour blur\", \"color overflow\"and \"color miscellaneous\", especially when they are coloring the images with hollowed-out structure. We propose a model based on triple conditional generative adversarial networks, for generator we provide contour image, colored icon and colorization mask as inputs, our network has three discriminators, structure discriminator is trained to judge if the generated icon has similar contour to the input icon, color discriminator anticipates generated icon and the input icon has the similar color style, the function of mask discriminator is to distinguish whether the output has the similar colorization area to the input mask. For the evaluation, we compared with some existing colorization models, also we made a questionnaire to obtain the evaluation of generated icons from different models. The results showed that our colorization model obtain better results comparing to the other models both in generating hollowed-out and solid structure icons.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124450605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
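To make the three-discriminator setup concrete, here is a minimal PyTorch sketch of the generator's adversarial loss against structure, color, and mask discriminators. The architectures, channel counts, and 64x64 icon size are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class Disc(nn.Module):
    """Tiny conditional discriminator: icon concatenated with one condition."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Flatten(), nn.LazyLinear(1))
    def forward(self, x):
        return self.net(x)

# One discriminator per condition: contour structure, color style, mask area.
d_struct = Disc(in_ch=3 + 1)   # generated icon + contour map
d_color  = Disc(in_ch=3 + 3)   # generated icon + color reference icon
d_mask   = Disc(in_ch=3 + 1)   # generated icon + colorization mask

bce = nn.BCEWithLogitsLoss()

def generator_loss(fake, contour, color_ref, mask):
    """Generator tries to fool all three conditional discriminators at once."""
    logits = [
        d_struct(torch.cat([fake, contour], dim=1)),
        d_color(torch.cat([fake, color_ref], dim=1)),
        d_mask(torch.cat([fake, mask], dim=1)),
    ]
    return sum(bce(l, torch.ones_like(l)) for l in logits)

# Random stand-ins for a generator output and its three conditions.
fake = torch.rand(2, 3, 64, 64)
contour = torch.rand(2, 1, 64, 64)
color_ref = torch.rand(2, 3, 64, 64)
mask = torch.rand(2, 1, 64, 64)
print(generator_loss(fake, contour, color_ref, mask))
```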
Deep Inter Coding with Interpolated Reference Frame for Hierarchical Coding Structure
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301769
Yu Guo, Zizheng Liu, Zhenzhong Chen, Shan Liu
{"title":"Deep Inter Coding with Interpolated Reference Frame for Hierarchical Coding Structure","authors":"Yu Guo, Zizheng Liu, Zhenzhong Chen, Shan Liu","doi":"10.1109/VCIP49819.2020.9301769","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301769","url":null,"abstract":"In the hybrid video coding framework, inter prediction is an efficient tool to exploit temporal redundancy. Since the performance of inter prediction depends on the content of reference frames, coding efficiency can be significantly improved by having more effective reference frames. In this paper, we propose an enhanced inter coding scheme by generating artificial reference frames with deep neural network. Specifically, a new reference frame is interpolated from two-sided previously reconstructed frames, which can be regarded as the prediction of the to-be-coded frame. The synthesized frame is merged into reference picture list for motion estimation to further decrease the prediction residual. We integrate the proposed method into HM-16.20 under random access configuration. Experimental results show that the proposed method can significantly boost the coding performance, which provides 4.6% BD-rate reduction on average compared to HEVC baseline.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123728211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
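A minimal PyTorch sketch of the core idea: interpolate a new frame from the two surrounding reconstructed frames and append it to the reference picture list. The paper integrates this into HM-16.20 in C++; the tiny network below is a stand-in, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FrameInterp(nn.Module):
    """Placeholder interpolation network: two frames in, one synthesized frame out."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())
    def forward(self, prev, nxt):
        return self.net(torch.cat([prev, nxt], dim=1))

interp = FrameInterp()
prev_rec = torch.rand(1, 3, 64, 64)   # reconstructed frame before the current one
next_rec = torch.rand(1, 3, 64, 64)   # reconstructed frame after it (hierarchical GOP)

# The synthesized frame joins the reference picture list used for motion estimation.
ref_list = [prev_rec, next_rec]
ref_list.append(interp(prev_rec, next_rec))
print(len(ref_list), ref_list[-1].shape)
```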
Improving Compression Artifact Reduction via End-to-End Learning of Side Information
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301805
Haichuan Ma, Dong Liu, Feng Wu
{"title":"Improving Compression Artifact Reduction via End-to-End Learning of Side Information","authors":"Haichuan Ma, Dong Liu, Feng Wu","doi":"10.1109/VCIP49819.2020.9301805","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301805","url":null,"abstract":"We propose to improve neural network-based compression artifact reduction by transmitting side information for the neural network. The side information consists of artifact descriptors that are obtained by analyzing the original and compressed images in the encoder. In the decoder, the received descriptors are used as additional input to a well-designed conditional post-processing neural network. To reduce the transmission overhead, the entire model is optimized under the rate-distortion constraint via end-to-end learning. Experimental results show that introducing the side information greatly improves the ability of the post-processing neural network, and improves the rate-distortion performance.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123190168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
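The following sketch illustrates the pipeline under stated assumptions: an analysis network derives a compact artifact descriptor from the original/compressed pair, and a conditional post-filter consumes it. The descriptor size, layer choices, and the rate proxy in the loss are all illustrative, not the paper's design.

```python
import torch
import torch.nn as nn

class DescriptorNet(nn.Module):
    """Encoder-side analysis: original + compressed image -> small descriptor."""
    def __init__(self, dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, dim))
    def forward(self, orig, comp):
        return self.net(torch.cat([orig, comp], dim=1))

class CondPostFilter(nn.Module):
    """Decoder-side restoration conditioned on the received descriptor."""
    def __init__(self, dim=8):
        super().__init__()
        self.body = nn.Conv2d(3 + dim, 3, 3, padding=1)
    def forward(self, comp, desc):
        b, _, h, w = comp.shape
        cond = desc.view(b, -1, 1, 1).expand(b, desc.shape[1], h, w)
        return comp + self.body(torch.cat([comp, cond], dim=1))  # residual restoration

orig = torch.rand(1, 3, 64, 64)
comp = torch.rand(1, 3, 64, 64)
desc = DescriptorNet()(orig, comp)
restored = CondPostFilter()(comp, desc)

# End-to-end R-D style objective: distortion plus lambda times a (stand-in)
# rate term for transmitting the descriptor.
loss = nn.functional.mse_loss(restored, orig) + 0.01 * desc.abs().mean()
print(loss)
```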
An Optimized Video Encoder Implementation with Screen Content Coding Tools
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301875
Xiaozhong Xu, Shitao Wang, Yu Chen, Yiming Li, Qing Zhang, Yushan Zheng, Shan Liu
{"title":"An Optimized Video Encoder Implementation with Screen Content Coding Tools","authors":"Xiaozhong Xu, Shitao Wang, Yu Chen, Yiming Li, Qing Zhang, Yushan Zheng, Shan Liu","doi":"10.1109/VCIP49819.2020.9301875","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301875","url":null,"abstract":"Screen content video applications require efficient coding of computer-generated materials. The new screen content coding tools such as intra block copy (IBC) and palette mode (PLT) have addressed this requirement. However, the added computational complexity on top of the existing sophisticated video encoders is also challenging. In this paper, we focus on the fast and efficient encoder implementation of these screen content coding tools. Improvements on hash-based IBC search, PLT optimization, mode decision between PLT and intra mode, and other general encoder accelerations towards screen content applications are studied and discussed. Experimental results show that with these methods added, the encoder can achieve some faster runtime performance than before while the compression efficiency is almost doubled with screen content coding tools.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134418050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
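Hash-based IBC search can be illustrated in a few lines of NumPy: previously coded blocks are indexed by a hash of their pixels, so exact-match candidates for the current block are found by lookup rather than exhaustive scanning. The block size and hash function here are simplifications of what a real encoder does.

```python
import numpy as np

BLK = 8  # illustrative block size

def block_hash(block: np.ndarray) -> int:
    return hash(block.tobytes())

def build_hash_table(frame: np.ndarray) -> dict:
    """Index every BLKxBLK block position in the frame by its pixel hash."""
    table = {}
    h, w = frame.shape
    for y in range(h - BLK + 1):
        for x in range(w - BLK + 1):
            key = block_hash(frame[y:y+BLK, x:x+BLK])
            table.setdefault(key, []).append((y, x))
    return table

frame = np.random.randint(0, 255, (64, 64), dtype=np.uint8)
frame[32:40, 32:40] = frame[0:8, 0:8]   # screen content tends to repeat itself

table = build_hash_table(frame)
candidates = table.get(block_hash(frame[32:40, 32:40]), [])
print(candidates)  # both copies show up as IBC block-vector candidates
```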
A Mixed Appearance-based and Coding Distortion-based CNN Fusion Approach for In-loop Filtering in Video Coding
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301895
Jian Yue, Yanbo Gao, Shuai Li, Menghu Jia
{"title":"A Mixed Appearance-based and Coding Distortion-based CNN Fusion Approach for In-loop Filtering in Video Coding","authors":"Jian Yue, Yanbo Gao, Shuai Li, Menghu Jia","doi":"10.1109/VCIP49819.2020.9301895","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301895","url":null,"abstract":"With the success of the convolutional neural networks (CNNs) in image denoising and other computer vision tasks, CNNs have been investigated for in-loop filtering in video coding. Many existing methods directly use CNNs as powerful tools for filtering without much analysis on its effect. Considering the in-loop filters process the reconstructed video frames produced from a fixed line of video coding operations, the coding distortion in the reconstructed frames may share similar properties that can be learned by CNNs in addition to being a noisy image. Therefore, in this paper, we first categorize the CNN based filtering into two types of processes: appearance-based CNN filtering and coding distortion-based CNN filtering, and develop a two-stream CNN fusion framework accordingly. In the appearance-based CNN filtering, a CNN processes the reconstructed frame as a distorted image and extracts the global appearance information to restore the original image. In order to extract the global information, a CNN with pooling is used first to increase the receptive field and up-sampling is added in the late stage to produce pixel-level frame information. On the contrary, in the coding distortion-based filtering, a CNN processes the reconstructed frame as blocks with certain types of distortions by focusing on the local information to learn the coding distortion resulted by the fixed video coding pipeline. Finally, the appearance-based filtering stream and the coding distortion-based filtering stream are fused together to combine the two aspects of CNN filtering, and also the global and local information. To further reduce the complexity, the similar initial and last convolutional layers are shared over two streams to generate a mixed CNN. Experiments demonstrate that the proposed method achieves better performance than the existing CNN-based filtering methods, with 11.26% BD-rate saving under the All Intra configuration.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"394 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113997253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
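A minimal sketch of the two-stream fusion idea: shared first and last convolutions, a pooled-then-upsampled appearance stream for global context, and a plain convolutional distortion stream for local context. Channel counts and depths are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class TwoStreamFilter(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.head = nn.Conv2d(3, ch, 3, padding=1)          # shared first layer
        self.appearance = nn.Sequential(                    # global: pool, conv, upsample
            nn.MaxPool2d(2),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False))
        self.distortion = nn.Sequential(                    # local: plain convolutions
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)          # shared last layer

    def forward(self, rec):
        f = torch.relu(self.head(rec))
        fused = self.appearance(f) + self.distortion(f)     # fuse global and local streams
        return rec + self.tail(fused)                       # residual in-loop filtering

print(TwoStreamFilter()(torch.rand(1, 3, 64, 64)).shape)
```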
APL: Adaptive Preloading of Short Video with Lyapunov Optimization
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301886
Haodan Zhang, Yixuan Ban, Xinggong Zhang, Zongming Guo, Zhimin Xu, Shengbin Meng, Junlin Li, Yue Wang
{"title":"APL: Adaptive Preloading of Short Video with Lyapunov Optimization","authors":"Haodan Zhang, Yixuan Ban, Xinggong Zhang, Zongming Guo, Zhimin Xu, Shengbin Meng, Junlin Li, Yue Wang","doi":"10.1109/VCIP49819.2020.9301886","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301886","url":null,"abstract":"Short video applications, like TikTok, have attracted many users across the world. It can feed short videos based on users' preferences and allow users to slide the boring content anywhere and anytime. To reduce the loading time and keep playback smoothness, most of the short video apps will preload the recommended short videos in advance. However, these apps preload short videos in fixed size and fixed order, which can lead to huge playback stall and huge bandwidth waste. To deal with these problems, we present an Adaptive Preloading mechanism for short videos based on Lyapunov Optimization, also called APL, to achieve near-optimal playback experience, i.e., maximizing playback smoothness and minimizing bandwidth waste considering users' sliding behaviors. Specifically, we make three technical contributions: (1) We design a novel short video streaming framework which can dynamically preload the recommended short videos before the current video is downloaded completely. (2) We formulate the preloading problem into a playback experience optimization problem to maximize the playback smoothness and minimize the bandwidth waste. (3) We transform the playback experience optimization problem during the whole viewing process into a single-step greedy algorithm based on the Lyapunov optimization theory to make the online decisions during playback. Through extensive experiments based on the real datasets that generously provided by TikTok, we demonstrate that APL can reduce the stall ratio by 81%/12% and bandwidth waste by 11%/31% compared with no-preloading/fixed-preloading mechanism.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114011376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
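The single-step greedy decision can be sketched as a drift-plus-penalty rule: at each step, preload the chunk that best trades buffer shortfall (stall risk) against the bandwidth expected to be wasted if the user slides away. The state variables, the deficit-queue formulation, and the constants below are illustrative assumptions, not the paper's exact formulation.

```python
V = 0.5        # Lyapunov trade-off weight: smoothness vs. bandwidth waste
TARGET = 5.0   # desired seconds of buffer per candidate video

def choose_chunk(buffers, slide_probs, chunk_cost=1.0):
    """Greedily pick which video's next chunk to preload this step.

    buffers[i]: seconds already buffered for candidate video i
    slide_probs[i]: probability the user slides away before watching video i
    """
    def drift_plus_penalty(i):
        deficit = max(0.0, TARGET - buffers[i])     # virtual queue: buffer shortfall
        drift = -deficit                            # filling the largest shortfall cuts stall risk
        penalty = V * slide_probs[i] * chunk_cost   # expected wasted bandwidth if user slides
        return drift + penalty
    return min(range(len(buffers)), key=drift_plus_penalty)

# Current video is well buffered; the next two are empty but may be skipped.
print(choose_chunk(buffers=[4.0, 0.5, 0.0], slide_probs=[0.1, 0.4, 0.8]))
```

Raising V makes the rule more reluctant to preload videos the user is likely to skip, which is exactly the smoothness/waste knob the formulation exposes.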
A Hybrid Model for Natural Face De-Identification with Adjustable Privacy
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301866
Yunqian Wen, Bo Liu, Rong Xie, Yunhui Zhu, Jingyi Cao, Li Song
{"title":"A Hybrid Model for Natural Face De-Identiation with Adjustable Privacy","authors":"Yunqian Wen, Bo Liu, Rong Xie, Yunhui Zhu, Jingyi Cao, Li Song","doi":"10.1109/VCIP49819.2020.9301866","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301866","url":null,"abstract":"As more and more personal photos are shared and tagged in social media, security and privacy protection are becoming an unprecedentedly focus of attention. Avoiding privacy risks such as unintended verification, becomes increasingly challenging. To enable people to enjoy uploading photos without having to consider these privacy concerns, it is crucial to study techniques that allow individuals to limit the identity information leaked in visual data. In this paper, we propose a novel hybrid model consists of two stages to generate visually pleasing de-identified face images according to a single input. Meanwhile, we successfully preserve visual similarity with the original face to retain data usability. Our approach combines latest advances in GAN-based face generation with well-designed adjustable randomness. In our experiments we show visually pleasing de-identified output of our method while preserving a high similarity to the original image content. Moreover, our method adapts well to the verificator of unknown structure, which further improves the practical value in our real life.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124944552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
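One way to picture "adjustable randomness" is blending a face's latent code with noise before regeneration, with a knob alpha trading privacy against similarity. The encoder and generator below are placeholders; the paper's two-stage hybrid model is not reproduced here.

```python
import torch
import torch.nn as nn

enc = nn.Linear(128, 64)   # placeholder face encoder (stand-in, not the paper's)
gen = nn.Linear(64, 128)   # placeholder face generator (stand-in, not the paper's)

def de_identify(face_feat: torch.Tensor, alpha: float) -> torch.Tensor:
    """alpha in [0, 1]: 0 keeps the identity, 1 fully randomizes it."""
    z = enc(face_feat)
    z_anon = (1 - alpha) * z + alpha * torch.randn_like(z)  # adjustable randomness
    return gen(z_anon)

face = torch.rand(1, 128)
mild = de_identify(face, alpha=0.3)    # stays visually similar, weaker privacy
strong = de_identify(face, alpha=0.9)  # stronger privacy, less similarity
print(mild.shape, strong.shape)
```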
Quality of Experience Evaluation for Streaming Video Using CGNN
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301799
Zhiming Zhou, Yu Dong, Li Song, Rong Xie, Lin Li, Bing Zhou
{"title":"Quality of Experience Evaluation for Streaming Video Using CGNN","authors":"Zhiming Zhou, Yu Dong, Li Song, Rong Xie, Lin Li, Bing Zhou","doi":"10.1109/VCIP49819.2020.9301799","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301799","url":null,"abstract":"One of the principal contradictions these days in the field of video i s lying between the booming demand for evaluating the streaming video quality and the low precision of the Quality of Experience prediction results. In this paper, we propose Convolutional Neural Network and Gate Recurrent Unit (CGNN)-QoE, a deep learning QoE model, that can predict overall and continuous scores of video streaming services accurately in real time. We further implement state-of-the-art models on the basis of their works and compare with our method on six public available datasets. In all considered scenarios, the CGNN-QoE outperforms existing methods.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127904174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
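A minimal sketch of a CNN-plus-GRU QoE predictor of the kind the name suggests: per-chunk features pass through a 1-D convolution, a GRU tracks temporal state, and a linear head emits a continuous score per time step. The feature set and layer sizes are assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class CGNNQoE(nn.Module):
    def __init__(self, feat=4, ch=16):
        super().__init__()
        self.conv = nn.Conv1d(feat, ch, kernel_size=3, padding=1)
        self.gru = nn.GRU(ch, ch, batch_first=True)
        self.head = nn.Linear(ch, 1)

    def forward(self, x):                        # x: (batch, time, features)
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        out, _ = self.gru(h)                     # temporal state across chunks
        return self.head(out).squeeze(-1)        # continuous QoE score per time step

# 30 chunks, 4 assumed features each (e.g. bitrate, stall time, quality switch, buffer).
scores = CGNNQoE()(torch.rand(2, 30, 4))
print(scores.shape)                              # torch.Size([2, 30])
```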
Application of Brain-Computer Interface and Virtual Reality in Advancing Cultural Experience
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301801
Hao-Lun Fu, Po-Hsiang Fang, Chan-Yu Chi, Chung-ting Kuo, Meng-Hsuan Liu, Howard Muchen Hsu, Cheng-Hsun Hsieh, Sheng-Fu Liang, S. Hsieh, Cheng-Ta Yang
{"title":"Application of Brain-Computer Interface and Virtual Reality in Advancing Cultural Experience","authors":"Hao-Lun Fu, Po-Hsiang Fang, Chan-Yu Chi, Chung-ting Kuo, Meng-Hsuan Liu, Howard Muchen Hsu, Cheng-Hsun Hsieh, Sheng-Fu Liang, S. Hsieh, Cheng-Ta Yang","doi":"10.1109/VCIP49819.2020.9301801","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301801","url":null,"abstract":"Virtual reality (VR), a computer-generated interactive environment, is provided to a user by projecting a peripheral image onto environmental surfaces. VR has an advantage of enhancing the immersive experience. Nowadays, VR has been widely applied in tourism and cultural experience. On the other hand, a recent integration of electroencephalography-based (EEG-based) brain-computer interface (BCI) and VR is capable of promoting the immersive virtual experience. Therefore, our study aims to propose an integrative framework to implement EEG-based BCI in a VR game to advance the cultural experience. A room escape game in a Tainan temple is created. EEG signals arc recorded while users arc playing the game. The online analyses of EEG signals arc used to interact with the VR display. This integrative framework can result in a better experience than the conventional setup.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126261817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
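As an illustration of the kind of online EEG analysis that could drive a VR event, the sketch below computes alpha-band power on a sliding window and fires a trigger on a threshold crossing. The band, window length, and threshold are assumptions, not details from the paper.

```python
import numpy as np

FS = 256  # assumed EEG sampling rate in Hz

def alpha_power(window: np.ndarray) -> float:
    """Mean spectral power in the 8-12 Hz alpha band of one EEG channel."""
    freqs = np.fft.rfftfreq(window.size, d=1.0 / FS)
    spectrum = np.abs(np.fft.rfft(window)) ** 2
    band = (freqs >= 8) & (freqs <= 12)
    return float(spectrum[band].mean())

eeg_window = np.random.randn(FS * 2)   # two seconds of one channel (stand-in data)
if alpha_power(eeg_window) > 50.0:     # arbitrary demo threshold
    print("trigger VR event")
```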
The Hough-Based Multibeamlet Transform
2020 IEEE International Conference on Visual Communications and Image Processing (VCIP) Pub Date : 2020-12-01 DOI: 10.1109/VCIP49819.2020.9301812
A. Lisowska
{"title":"The Hough-Based Multibeamlet Transform","authors":"A. Lisowska","doi":"10.1109/VCIP49819.2020.9301812","DOIUrl":"https://doi.org/10.1109/VCIP49819.2020.9301812","url":null,"abstract":"There are plenty of geometrical multiresolution transforms devoted to efficient edge representation. However, they have two drawbacks. The first one is that such transforms represent mono edge models. And the second one is that they are often based on approximations which are optimal according to the Mean Square Error what does not necessarily lead to optimal edge approximation. In this paper the multibeamlet transform based on the Hough transform is proposed. This transform is defined to properly detect multiedges present in images. Next, the method of image approximation with the use of the multibeamlet transform is described. Additionally, the modified bottom-up tree pruning algorithm is presented in order to properly approximate images with the use of multibeamlets. As follows from the performed experiments, this approach leads to image approximations with better quality than the state-of-the-art geometrical multiresolution transforms.","PeriodicalId":431880,"journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129055012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
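The multibeamlet transform builds on the classical Hough transform, which can be sketched compactly: edge pixels vote in a (rho, theta) accumulator, and peaks correspond to line segments. The bin sizes below are illustrative; the paper's multiedge extension is not reproduced here.

```python
import numpy as np

def hough_lines(edges: np.ndarray, n_theta=180):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta)."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    acc = np.zeros((2 * diag, n_theta), dtype=np.int32)   # (rho, theta) accumulator
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    ys, xs = np.nonzero(edges)
    for y, x in zip(ys, xs):
        rhos = (x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[rhos, np.arange(n_theta)] += 1                # one vote per (pixel, theta)
    return acc, thetas

edges = np.zeros((32, 32), dtype=np.uint8)
edges[16, :] = 1                                          # a horizontal edge at y = 16

acc, thetas = hough_lines(edges)
rho_i, th_i = np.unravel_index(acc.argmax(), acc.shape)
print(np.degrees(thetas[th_i]))                           # ~90 degrees for a horizontal line
```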