{"title":"Perpetual video camera for Internet-of-things","authors":"Yen-kuang Chen, Shao-Yi Chien","doi":"10.1109/VCIP.2012.6410856","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410856","url":null,"abstract":"Digital sensing, processing, and communication capabilities will be ubiquitously embedded into everyday objects, turning them into an Internet of Things (IoT; also known as machine-to-machine, M2M). More importantly, everyday objects will become data generators, with sensors everywhere continuously collecting large quantities of data about their context and use, processors everywhere analyzing and inferring useful knowledge from the data, and communication radios transmitting and exchanging that knowledge with other objects and with cloud-based resources. This is the next-generation Internet: rather than data produced mainly by humans and for humans, in the new machine-to-machine-era Internet, data are generated by machines (sensors) and communicated without human involvement to other machines (servers or other computer systems) for automated processing that enables automated or human actions, at speeds and scales unseen in the existing Internet. Distributed video cameras will play important roles in various IoT/M2M applications. To resolve the problems of high data rate, high power consumption, and large deployment cost of large-scale distributed video sensors, perpetual video cameras, whose net energy consumption is almost zero, are required. Designing such cameras raises many technology and design challenges, such as energy harvesting, distributed video coding, distributed video analysis, and the associated VLSI designs. To address these issues and challenges, this tutorial will provide (1) an overview of challenges and opportunities in M2M, (2) an introduction to distributed smart cameras in M2M applications, (3) an analysis of the power consumption of distributed cameras, (4) an introduction to energy-harvesting techniques, (5) distributed video coding, and (6) distributed video analysis techniques, covering both state-of-the-art work and possible future research directions. Finally, we will conclude the tutorial with some possible applications.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130186916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wavelet domain image super-resolution from digital cinema to ultrahigh definition television by dividing noise component","authors":"Y. Matsuo, Shinya Iwasaki, Y. Yamamura, J. Katto","doi":"10.1109/VCIP.2012.6410830","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410830","url":null,"abstract":"We propose a novel wavelet-domain image super-resolution method for converting digital cinema to ultrahigh-definition television that accounts for the cinema noise component. The proposed method first divides the original image into signal and noise components, super-resolves each separately, and then synthesizes the two to expand the spatial resolution. The noise component is separated using spatio-temporal wavelet decomposition based on a frequency-spectrum analysis of cinema noise, and the super-resolution parameters are optimized by comparing size-reduced super-resolution images with the original image. Experimental results showed that a super-resolution image produced by the proposed method has a subjectively better appearance and an objectively better peak signal-to-noise ratio than those produced by conventional methods.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114794256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
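The signal/noise separation via wavelet decomposition described in the abstract above can be illustrated with a one-level 2-D Haar transform. This is a generic sketch, not the paper's spatio-temporal decomposition; the function name and data are illustrative only.

```python
import numpy as np

# Minimal illustration (not the paper's method): a one-level 2-D Haar
# wavelet decomposition, the kind of transform used to split an image into
# a low-frequency "signal" band (LL) and high-frequency detail bands
# (LH, HL, HH) where film-grain noise concentrates.
def haar2d(img):
    img = img.astype(float)
    # Horizontal pass: averages and differences of column pairs.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2
    hi = (img[:, 0::2] - img[:, 1::2]) / 2
    # Vertical pass on each half: yields the LL, LH, HL, HH sub-bands.
    LL = (lo[0::2] + lo[1::2]) / 2
    LH = (lo[0::2] - lo[1::2]) / 2
    HL = (hi[0::2] + hi[1::2]) / 2
    HH = (hi[0::2] - hi[1::2]) / 2
    return LL, LH, HL, HH

flat = np.full((4, 4), 8.0)        # a constant image has no detail...
LL, LH, HL, HH = haar2d(flat)
print(LL[0, 0], HH.sum())          # 8.0 0.0  (...so all detail bands are zero)
```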
{"title":"Generalized MMSD feature extraction using QR decomposition","authors":"Ning Zheng, L. Qi, Lei Gao, L. Guan","doi":"10.1109/VCIP.2012.6410757","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410757","url":null,"abstract":"The multiple maximum scatter difference (MMSD) discriminant criterion is an effective feature extraction method that computes discriminant vectors from both the range of the between-class scatter matrix and the null space of the within-class scatter matrix. However, MMSD involves two singular value decompositions (SVDs), making the method impractical for high-dimensional data. In this paper, we propose a novel method for feature extraction and classification based on the MMSD criterion, called generalized MMSD (GMMSD), which employs QR decomposition rather than SVD. Unlike MMSD, GMMSD does not require computation of the whole scatter matrix; instead, it computes the discriminant vectors from both the range of the whitened input data matrix and the null space of the within-class scatter matrix. We evaluate the effectiveness of GMMSD in terms of classification accuracy in the reduced-dimensional space. Our experiments on two facial expression databases demonstrate that GMMSD provides favorable performance in terms of both recognition accuracy and computational efficiency.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125244895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
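The QR-for-SVD substitution that the abstract above relies on can be sketched minimally: an economy QR factorisation yields an orthonormal basis of the data's range far more cheaply than an SVD. This is a generic illustration on synthetic data, not the authors' full GMMSD algorithm.

```python
import numpy as np

# Sketch (assumption-laden, not the paper's exact procedure): use an
# economy QR factorisation of the centred (features x samples) data matrix
# to obtain an orthonormal basis of its range, then project the data onto
# that basis for dimensionality reduction.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))          # 100 samples, 20 features
Xc = X - X.mean(axis=0)                     # centre the data
Q, R = np.linalg.qr(Xc.T, mode="reduced")   # Q: orthonormal basis of range(Xc.T)
features = Xc @ Q                            # reduced/rotated representation
print(features.shape)                        # (100, 20)
```

The design point is cost: for a d×n matrix with d ≫ n, QR costs O(dn²) with a smaller constant than the SVD and avoids forming the d×d scatter matrix entirely.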
{"title":"Intuitive mobile image browsing on a hexagonal lattice","authors":"G. Schaefer, M. Tallyn, Daniel Felton, David Edmundson, William Plant","doi":"10.1109/VCIP.2012.6410864","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410864","url":null,"abstract":"Following the miniaturisation of cameras and their integration into mobile devices such as smartphones, combined with the intensive use of these devices, it is likely that in the near future the majority of digital images will be captured using such devices rather than dedicated cameras. Since many users decide to keep their photos on their mobile devices, effective methods for managing these image collections are required. Common image browsers prove to be of only limited use, especially for large image sets [1].","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124913734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Action retrieval based on generalized dynamic depth data matching","authors":"Lujun Chen, H. Yao, Xiaoshuai Sun","doi":"10.1109/VCIP.2012.6410774","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410774","url":null,"abstract":"With the great popularity and wide application of Kinect, more and more depth data are being shared on the Internet, and making effective use of this abundant depth data is valuable. In this paper, we propose a generalized dynamic depth data matching framework for action retrieval. First, we focus on single-depth-image matching using both depth and shape features. The depth feature used in our method is straightforward but proves to be very effective and robust for distinguishing various human actions. Then, we adopt shape context, which is widely used in shape matching, to strengthen the robustness of our matching strategy. Finally, we use Dynamic Time Warping to measure the temporal similarity between two depth video sequences. Experiments on a dataset of 17 classes of actions from 10 different individuals demonstrate the effectiveness and robustness of the proposed matching strategy.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126128127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
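The Dynamic Time Warping step used in the abstract above to compare two depth video sequences can be sketched generically; the scalar sequences here are stand-ins for the paper's per-frame depth/shape descriptors.

```python
def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) Dynamic Time Warping distance between two
    1-D sequences (scalar stand-ins for per-frame depth/shape features)."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# Identical sequences align at zero cost; stretching in time is also free,
# which is why DTW suits actions performed at different speeds.
print(dtw_distance([1, 2, 3, 2, 1], [1, 2, 3, 2, 1]))  # 0.0
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3]))     # 0.0
```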
{"title":"Distortion-optimal receiver grouping for MD-FEC coded video streaming","authors":"Han Zhang, Adarsh K. Ramasubramonian, K. Kar, J. Woods","doi":"10.1109/VCIP.2012.6410806","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410806","url":null,"abstract":"Multiple Description with Forward Error Correction (MD-FEC) coding provides the flexibility, easy adaptivity, and distortion-rate optimality that are desirable for delivering streaming video in a network environment with time-varying bandwidth fluctuations and random packet losses. In this paper, we consider how diverse receivers of a video stream should be grouped - where each group receives an MD-FEC coded bitstream optimized for that group - so that the average video distortion is minimized across all receivers. We show that a sequential grouping solution is optimal for linear distortion-rate functions. For non-linear distortion-rate functions, while the optimal grouping structure may not be sequential in general, we observe that the approximation factor attained by the best sequential solution can be characterized in terms of the “degree of convexity” of the distortion-rate function. Numerical experiments with realistic distortion-rate functions reveal that the difference between the globally optimal grouping solution and the best sequential solution is typically small. We provide a dynamic-programming-based polynomial-time algorithm to compute the best sequential solution.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115035156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
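The sequential-grouping dynamic program mentioned in the abstract above can be sketched as follows. The `group_cost` function here is a purely illustrative stand-in (the bandwidth spread within a group); the paper optimizes actual MD-FEC distortion-rate functions.

```python
import functools

# Hedged sketch of the dynamic-programming idea: receivers sorted by
# bandwidth are partitioned into k contiguous ("sequential") groups, and
# we minimize the total cost of serving each group with one bitstream.
def best_sequential_grouping(bandwidths, k):
    bw = sorted(bandwidths)
    n = len(bw)

    def group_cost(i, j):
        # Illustrative per-group distortion proxy: spread of bw[i:j].
        return bw[j - 1] - bw[i]

    @functools.lru_cache(maxsize=None)
    def dp(i, groups):
        # Minimal cost of covering receivers bw[i:] with `groups` groups.
        if groups == 0:
            return 0.0 if i == n else float("inf")
        if i == n:
            return float("inf")
        return min(group_cost(i, j) + dp(j, groups - 1)
                   for j in range(i + 1, n + 1))

    return dp(0, k)

# Two natural clusters -> the best 2-way split separates them.
print(best_sequential_grouping([1, 2, 10, 11], 2))  # 2.0
```

Restricting to contiguous groups is what makes the problem polynomial: there are only O(n²k) dynamic-programming states instead of exponentially many arbitrary partitions.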
{"title":"Bit-depth expansion using Minimum Risk Based Classification","authors":"G. Mittal, V. Jakhetiya, S. Jaiswal, O. Au, A. Tiwari, Dai Wei","doi":"10.1109/VCIP.2012.6410837","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410837","url":null,"abstract":"Bit-depth expansion is the process of converting a low-bit-depth image into a high-bit-depth image; the bit-depth of an image is the number of bits used to represent each intensity value. Bit-depth expansion is an important field because it directly affects display quality. In this paper, we propose a novel bit-depth expansion method that uses Minimum Risk Based Classification to create the high-bit-depth image. Blurring and other annoying artifacts are reduced by this method. Our method gives better objective quality (PSNR) and superior visual quality compared with recently developed bit-depth expansion algorithms.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116452589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
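For context on the kind of baseline that bit-depth-expansion algorithms such as the one above improve on (this is not the paper's Minimum Risk Based Classification method), the common naive reference is bit replication:

```python
# Illustrative baseline only: expand a low-bit-depth code to a higher
# bit depth by repeating its bit pattern. This maps 0 -> 0 and the
# maximum low-depth code to the maximum high-depth code, but introduces
# the banding/contouring artifacts that smarter methods suppress.
def bit_replicate(value, low_bits, high_bits):
    out = 0
    bits = high_bits
    while bits > 0:
        shift = bits - low_bits
        # Place a (possibly truncated) copy of the pattern at this offset.
        out |= (value << shift) if shift >= 0 else (value >> -shift)
        bits -= low_bits
    return out

# 4-bit 0b1111 (15) becomes 8-bit 0b11111111 (255).
print(bit_replicate(0b1111, 4, 8))  # 255
```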
{"title":"WaveCast: Wavelet based wireless video broadcast using lossy transmission","authors":"Xiaopeng Fan, Ruiqin Xiong, Feng Wu, Debin Zhao","doi":"10.1109/VCIP.2012.6410743","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410743","url":null,"abstract":"Wireless video broadcasting is a popular mobile-network application. However, traditional approaches offer limited support for accommodating users with diverse channel conditions. The recently proposed Softcast approach provides smooth multicast performance but is not very efficient at inter-frame compression. In this work, we propose a new video multicast approach, WaveCast. Unlike Softcast, WaveCast uses a motion-compensated temporal filter (MCTF) to exploit inter-frame redundancy and uses a conventional framework to transmit motion information, so that the motion vectors can be reconstructed losslessly. Meanwhile, WaveCast transmits the transform coefficients in lossy mode and degrades gracefully in multicast. In experiments, WaveCast outperforms Softcast by 2 dB in video PSNR at low channel SNR, and outperforms an H.264-based framework by up to 8 dB in broadcast.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116544072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel scheme to code object flags for video synopsis","authors":"Shizheng Wang, Heguang Liu, Danfeng Xie, Binwei Zeng","doi":"10.1109/VCIP.2012.6410771","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410771","url":null,"abstract":"With the expanding applications of digital surveillance video, fast browsing techniques for surveillance video have become a hot topic in the field. As one of the most important supporting technologies, region-of-interest (ROI) information coding is a necessary part of a fast-browsing framework for surveillance video. This paper therefore proposes a novel coding method for object region flags and a successful application of object mapping flags to the scalable browsing of surveillance video. The object-region-flag coding method includes both intra-frame and inter-frame coding: the former eliminates spatial redundancy by improving the scanning mode of the region flags, while the latter eliminates temporal redundancy, which markedly cuts the coding cost and provides essential support for video-synopsis browsing of surveillance video.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"134 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116579026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}