{"title":"A cost-efficient hardware architecture of deblocking filter in HEVC","authors":"Xin Ye, Dandan Ding, Lu Yu","doi":"10.1109/VCIP.2014.7051541","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051541","url":null,"abstract":"This paper presents a hardware architecture of deblocking filter (DBF) for High Efficiency Video Coding (HEVC) by jointly considering system throughput and hardware cost. A hybrid pipeline with two processing levels is adopted to improve system performance. With the hybrid pipeline, only one 1-D filter and single-port on-chip SRAM are used. According to the data dependence between neighbouring edges, a shifted 16×16 basic processing unit as well as corresponding filtering order is proposed. It reduces memory cost and makes the DBF friendlier to work in a coding/decoding system. The proposed hardware architecture is synthesized under 0.13um standard CMOS technology and result shows that it consumes 17.6k gates at an operating frequency of 250MHz. Consequently, the design can support real-time processing of QFHD (3840×2160) video applications at 60 fps.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131397355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tagged multi-hypothesis motion compensation scheme for video coding","authors":"Lei Chen, Ronggang Wang, Siwei Ma","doi":"10.1109/VCIP.2014.7051518","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051518","url":null,"abstract":"Accuracy of prediction block (PB) plays a very important role in improving the coding performance. In this paper, we propose tagged multi-hypothesis motion compensation scheme (TMHMC) for inter frames to improve the accuracy of PB. TMHMC not only makes use of temporal correlation between frames but also the spatial correlation as motion vectors of adjacent blocks are used to derive the PB. For entropy coding process, only one motion vector and a tag indicating which adjacent block is used are coded in bit-stream. Adding TMHMC scheme as an additional mode in MPEG internet video coding (TVC) platform, the bitrate saving is up to 12% at the same objective quality compared with anchor. Average bitrate saving is close to 6% over all test sequences. In addition, we also implement the conventional multi-hypothesis motion compensation (MHMC) scheme. 3% bitrate is further saved on average by TMHMC compared with conventional MHMC.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127482158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast coding algorithm based on inter-view correlations for 3D-HEVC","authors":"Guangsheng Chi, Xin Jin, Qionghai Dai","doi":"10.1109/VCIP.2014.7051584","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051584","url":null,"abstract":"The newly published 3D-HEVC has received a remarkable response due to its high compression efficiency which is based on High Efficiency Video Coding (HEVC). However, the complexity of its encoding process is also large as a result of introducing the coding units (CU) size decision process together with the rate distortion optimization (RDO) process. In this paper, a fast coding algorithm making good use of the interview correlations is proposed. With the inter-view correlation statistical analysis, the CU depth candidates of the dependent views can be predicted from the independent view instead of the brute force RDO process in determining CU depth. The experimental results show that the proposed method saves 51% time in texture coding and the loss is negligible.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114203099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depth inference with convolutional neural network","authors":"Hu Tian, Bojin Zhuang, Yan Hua, A. Cai","doi":"10.1109/VCIP.2014.7051531","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051531","url":null,"abstract":"The goal of depth inference from a single image is to assign a depth to each pixel in the image according to the image content. In this paper, we propose a deep learning model for this task. This model consists of a convolutional neural network (CNN) with a linear regressor being as the last layer. The network is trained with raw RGB image patches cropped by a large window centered at each pixel of an image to extract feature representations. Then the depth map of a test image can be efficiently obtained by forward-passing the image through the trained model plus a simple up-sampling. Contrary to most previous methods based on graphical model and depth sampling, our method alleviates the needs for engineered features and for assumptions about semantic information of the scene. We achieve state-of-the-art results on Make 3D dataset, while keeping low computational time at the test time.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123853916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast intra mode decision for HEVC based on texture characteristic from RMD and MPM","authors":"Dongdong Zhang, Youwei Chen, E. Izquierdo","doi":"10.1109/VCIP.2014.7051618","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051618","url":null,"abstract":"In this paper, we proposed a fast intra mode decision algorithm for HEVC to further reduce the candidate modes of RDOQ or even skip RDOQ for a PU, which exploited not only the texture consistency of neighbouring PUs reflected by the relation between the optimal RMD mode and MPM, but also the texture characteristic in a PU reflected by the best two RMD modes. Experimental results show that our proposed algorithm can reduce 31.3% intra coding time with 0.56% BD-rate loss on average.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121629845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid modeling of natural image in wavelet domain","authors":"Chongwu Tang, Xiaokang Yang, Guangtao Zhai","doi":"10.1109/VCIP.2014.7051501","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051501","url":null,"abstract":"Natural image is characterized by its highly kurtotic and heavy-tailed distribution in wavelet domain. These typical non-Gaussian statistics are commonly described by generalized Gaussian density (GGD) or α-stable distribution. However, each of the two models has its own deficiency to capture the variety and complexity of real world scenes. Considering the statistical properties of GGD and α-stable distributions respectively, in this paper we propose a hybrid statistical model of natural image's wavelet coefficients which is better in describing the leptokurtosis and heavy tails simultaneously. Based on a linearly weighted fusion of GGD and α-stable functions, we derive the optimal parametric hybrid model, and measure the model accuracy using Kullback-Leibler divergence, which evaluates the similarity between two probability distributions. Experiment results and comparative studies demonstrate that the proposed hybrid model is closer to the true distribution of natural image's wavelet coefficients than single GGD or α-stable modeling.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121887276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved disparity vector derivation for inter-view residual prediction in 3D-HEVC","authors":"Shiori Sugimoto, S. Shimizu, Akira Kojima","doi":"10.1109/VCIP.2014.7051508","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051508","url":null,"abstract":"Inter-view residual prediction and advanced residual prediction (ARP) are efficient tools for coding dependent views of 3D video. It can predict the residue of motion compensated prediction (MCP) using the additional derived disparity vector. However, its coding performance depends on the accuracy of disparity vector derivation. In this paper, we propose the improved disparity vector derivation scheme for ARP. In the proposed scheme, the disparity vector is derived from the corresponding block in the reference picture. And that corresponding block is pointed by the same as MCP. Moreover, the disparity vector can be derived not only from the current reference block, but also from the blocks on the all other reference pictures included in the current reference picture lists. In addition, the disparity vector can be derived from both of the blocks predicted by disparity compensated prediction (DCP) and the blocks predicted by MCP and ARP because the derived disparity vector for ARP is stored in additional disparity vector field. Experimental results show that 0.2% bitrate reduction of synthesized views, and up to 0.5% for each dependent views in the reference software of 3D-HEVC.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115606292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and smooth 3D reconstruction using multiple RGB-Depth sensors","authors":"D. Alexiadis, D. Zarpalas, P. Daras","doi":"10.1109/VCIP.2014.7051532","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051532","url":null,"abstract":"In this paper, the problem of real-time, full 3D reconstruction of foreground moving objects, an important task for Tele-Immersion applications, is addressed. More specifically, the proposed reconstruction method receives input from multiple consumer RGB-Depth cameras. A fast and efficient method to calibrate the sensors in initially described. More importantly, an efficient method to smoothly fuse the captured raw point sets is then presented, followed by a volumetric method to produce watertight and manifold meshes. Given the implementation details, the proposed method can operate at high frame rates. The experimental results, with respect to reconstruction quality and rates, verify the effectiveness of the proposed methodology.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124468517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information security display via uncrowded window","authors":"Zhongpai Gao, Guangtao Zhai, Jiantao Zhou, Xiongkuo Min, Chunjia Hu","doi":"10.1109/VCIP.2014.7051604","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051604","url":null,"abstract":"With the booming of visual media, people pay more and more attention to privacy protection in public environments. Most existing research on information security such as cryptography and steganography is mainly concerned about transmission and yet little has been done to prevent the information displayed on screens from reaching eyes of the bystanders. This \"security of the last foot (SOLF)\" problem, if left without being taken care of, will inevitably lead to the total failure of a trustable information communication system. To deal with the SOLF problem, for the application of text-reading, we proposed an eye tracking based solution using the newly revealed concept of uncrowded window from vision research. The theory of uncrowded window suggests that human vision can only effectively recognize objects inside a small window. Object features outside the window may still be detectable but the feature detection results cannot be efficiently combined properly and therefore those objects will not be recognizable. We use eye-tracker to locate fixation points of the authorized reader in real time, and only the area inside the uncrowded window displays the private information we want to protect. A number of dummy windows with fake messages are displayed around the real uncrowded window as diversions. And without the precise knowledge about the fixations of the authorized reader, the chance for bystanders to capture the private message from those surrounding area and the dummy windows is very low. Meanwhile, since the authorized reader can only read within the uncrowded window, detrimental impact of those dummy windows is almost negligible. The proposed prototype system was written in C++ with SDKs of Direct3D, Tobii Gaze SDK, CEGUI, MuPDF, OpenCV and etc. Extended demonstration of the system will be provided to show that the proposed method is an effective solution to SOLF problem of information communication and display.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116755016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lossless compression of JPEG coded photo albums","authors":"Hao Wu, Xiaoyan Sun, Jingyu Yang, Feng Wu","doi":"10.1109/VCIP.2014.7051625","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051625","url":null,"abstract":"The explosion in digital photography poses a significant challenge when it comes to photo storage for both personal devices and the Internet. In this paper, we propose a novel lossless compression method to further reduce the storage size of a set of JPEG coded correlated images. In this method, we propose jointly removing the inter-image redundancy in the feature, spatial, and frequency domains. For each album, we first organize the images into a pseudo video by minimizing the global predictive cost in the feature domain. We then introduce a disparity compensation method to enhance the spatial correlation between images. Finally, the redundancy between the compensated signal and the corresponding target image is adaptively reduced in the frequency domain. Moreover, our proposed scheme is able to losslessly recover not only raw images but also JPEG files. Experimental results demonstrate the efficiency of our proposed lossless compression, which achieves more than 12% bit-saving on average compared with JPEG coded albums.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133532929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}