{"title":"Snakes assisted food image segmentation","authors":"Y. He, N. Khanna, C. Boushey, E. Delp","doi":"10.1109/MMSP.2012.6343437","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343437","url":null,"abstract":"In this paper we describe an image segmentation method for segmenting food items in images used for dietary assessment. Dietary assessment methods used to determine the foods and beverages consumed at a meal are essential for understanding the link between diet and health. Snakes, or active contours, are used extensively to locate object boundaries and segment images. Experimental results using classical snakes on food images show the problems associated with contour initialization and poor detection performance for food images. In this paper, we explore various methods of contour initialization and integrate a background removal method to improve the performance of food image segmentation. We describe the details of the proposed food image segmentation method and also evaluate our segmentation approach on food images.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115052651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonlinear additive model based saliency map weighting strategy for image quality assessment","authors":"Ke Gu, Guangtao Zhai, Xiaokang Yang, Li Chen, Wenjun Zhang","doi":"10.1109/MMSP.2012.6343461","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343461","url":null,"abstract":"Most state-of-the-art image quality metrics are based on the two-step approach: local distortion/fidelity measurement and pooling. During the pooling stage, many weighting strategies have been proposed incorporating properties of the distortion itself, various masking effects and visual attention. Recently, researchers have devoted great enthusiasm and effort to the improvement of image quality assessment using visual saliency models. In this research, it is noticed that visual saliency features of both the original image and the distorted one have impacts on the process of image quality assessment. To reduce the overlapping effects, a nonlinear additive model is proposed to integrate saliency features from the original and distorted images towards improved error weighting results. Our extensive experimental studies on four publicly available image databases (LIVE, TID2008, CSIQ and A57) indicate that the proposed improved nonlinear additive model based saliency map weighting strategy constantly leads to higher prediction accuracy for image quality assessment than traditional methods.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"85 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134127837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel local audio fingerprinting algorithm","authors":"Mani Malekesmaeili, R. Ward","doi":"10.1109/MMSP.2012.6343429","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343429","url":null,"abstract":"A local fingerprinting algorithm is proposed for the purpose of audio copy detection. The proposed algorithm is robust to noise as well as tempo and pitch modifications of the audio signal. The fingerprints are extracted from adaptively scaled patches of the time-chroma representation of the audio signal. The proposed time-chroma representation, converts tempo change and pitch shift attacks on an audio signal to scaling and circular shift attacks on images, respectively. The proposed algorithm is shown to outperform the state-of-the-art.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134145654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regularized sequential selection and backtracking removal for CS atom matching","authors":"Chunyan Zeng, Lihong Ma, Ming-hui Du, Jing Tian","doi":"10.1109/MMSP.2012.6343442","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343442","url":null,"abstract":"Atom selection is crucial to compressive sensing (CS) reconstruction by orthogonal matching pursuit (OMP), where the look-ahead (LA) OMP algorithm (LAOMP) evaluated final effects of all the LA atoms before they were included into a support set, certainly, a high computation burden has to be suffered. This paper modifies LAOMP method by two folds: 1) Regularization (R-LAOMP) is introduced to restrict the atom selection by similar small residuals, while mutual effects of new selected atoms are considered to alleviate the high computation costs. 2) Backtracking-based (LA-BOMP) atom pruning is employed to remove the most mismatching atoms in support sets to balance the accuracy and the random disturbance in optimization procedures. Accordingly this regularized forward atom evaluation combining backward atom deleting method (R-LA-BOMP) leads to a significant improvement in LAOMP, while a trade-off between performance and complexity is achieved. Experiments of the regularized atom selection and the backtracking pruning algorithms are performed on Gaussian sparse signals, 0-1 sparse signals and speech voices and the results are given.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133436449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of mesh-based motion compensation in wavelet lifting of dynamical 3-D+t CT data","authors":"Wolfgang Schnurrer, T. Richter, Jürgen Seiler, André Kaup","doi":"10.1109/MMSP.2012.6343432","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343432","url":null,"abstract":"Factorized in the lifting structure, the wavelet transform can easily be extended by arbitrary compensation methods. Thereby, the transform can be adapted to displacements in the signal without losing the ability of perfect reconstruction. This leads to an improvement of scalability. In temporal direction of dynamic medical 3-D+t volumes from Computed Tomography, displacement is mainly given by expansion and compression of tissue. We show that these smooth movements can be well compensated with a mesh-based method. We compare the properties of triangle and quadrilateral meshes. We also show that with a mesh-based compensation approach coding results are comparable to the common slice wise coding with JPEG 2000 while a scalable representation in temporal direction can be achieved.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132383305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality-optimized encoding of JPEG images using transform domain sparsification","authors":"Junichi Ishida, Gene Cheung, Akira Kubota, Antonio Ortega","doi":"10.1109/MMSP.2012.6343453","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343453","url":null,"abstract":"To account for the unique characteristics and limitations of the human visual system (HVS) when perceiving images, a variety of perceptual quality metrics have been proposed in the literature. Tailoring rate-distortion (RD) optimization for each metric is cumbersome and time-consuming. In this paper, we propose a general RD-optimization strategy called “transform domain bounding box” (BB) that can easily adapt to different quality metrics for JPEG-like block-based encoding of images. First, we define an objective function that is a weighted sum of the l0-norm of the transform coefficients (a proxy for rate) and distortion from the transform domain representation. Next, for a given distortion target τ, we define a don't care region (DCR) that specifies a search region of representations with distortion ≤τ. We then show that the sparsest transform domain representation (lowest encoding rate) inside a BB that tightly contains the DCR can be constructed efficiently. Varying τ to induce different DCRs and corresponding BBs results in a set of constructed sparse representations of different sparsity counts, and the one that optimally trades off rate and distortion can be easily identified as solution to our objective. We show that our proposed BB strategy can be easily re-targeted for three common quality metrics: MSE, MSE-HVS-M and SSIM. Experimental results show that our BB strategy outperformed unoptimized JPEG compression by up to 1dB in PSNR when distortion metric is MSE, up to 2dB when metric is MSE-HVS-M, and up to 0.005 when metric is SSIM.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122373951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing recommended video lists for Youtube-like social media","authors":"Xiaoqiang Ma, Haiyang Wang, Haitao Li, Jiangchuan Liu, Hongbo Jiang","doi":"10.1109/MMSP.2012.6343448","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343448","url":null,"abstract":"Youtube-like video sharing sites (VSSes) have gained increasing popularity in recent years. Meanwhile, Facebook-like online social networks (OSNs), have seen their tremendous success in connecting people of common interests. These two new generation of networked services are now bridged in that many users of OSNs share video contents originating from VSSes with their friends, and it has been shown that a significant portion of views of VSSes are attributed to this sharing scheme of social networks. To understand how the video sharing behavior, which is largely based on social relationship, impacts users' viewing pattern, we have conducted a long-term measurement with RenRen and YouKu, the largest online social network and the largest video sharing site in China, respectively. We show that social friends are more likely to have common interests and their sharing behaviors provide guidance to enhance recommended video lists. In this paper, we take a first step toward learning OSN video sharing patterns for VSS video recommendation. An auto-encoder model is developed to learn the social similarity of different videos in terms of their sharing in OSN. We therefore propose a similarity-based strategy to enhance recommended video lists for VSSes. Evaluation results demonstrate that this strategy can remarkably improve the precision in VSSes, as compared to state-of-the-art strategies without social information.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125098969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient binary representation of delta Quantization Parameter for High Efficiency Video Coding","authors":"K. Chono","doi":"10.1109/MMSP.2012.6343414","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343414","url":null,"abstract":"This paper proposes an efficient binary representation of delta Quantization Parameter (QP) for High Efficiency Video Coding (HEVC). Video encoders adapt QPs of coding blocks for visual quality optimization and rate control. Although they send only delta QPs (dQPs) obtained by causal prediction, the side information overhead is expensive. Therefore the HEVC design necessitates an efficient dQP coding. The proposed scheme converts a dQP to a binary string in which the first and second bins indicate the significance and sign of the dQP respectively and the rest represents the magnitude minus 1. Furthermore, it detects and truncates redundant bins in the binary strings by using the sign and an admissible dQP range. Thus it reduces the length of dQP binary strings and improves dQP coding efficiency. Simulation results using HEVC reference software demonstrate that the proposed scheme improves the dQP coding efficiency by 6% while reducing its bin rates by 25%.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126129954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved endoscope distortion correction does not necessarily enhance mucosa-classification based medical decision support systems","authors":"Michael Gschwandtner, Jutta Hämmerle-Uhl, Y. Höller, M. Liedlgruber, A. Uhl, A. Vécsei","doi":"10.1109/MMSP.2012.6343433","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343433","url":null,"abstract":"Distortion correction in two variants is applied to endoscopic duodenal imagery aiming at an improvement of automated classification of celiac disease affected mucosa patches. In a set of heterogeneous feature extraction techniques, only geometry and shape related ones are able to benefit from distortion correction, while for others, even a decrease of classification accuracy is observed. Different types of distortion correction do not lead to significantly different behaviour in the observed application scenario.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130450583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable depth maps with R-D optimized embedding","authors":"R. Mathew, D. Taubman, P. Zanuttigh","doi":"10.1109/MMSP.2012.6343452","DOIUrl":"https://doi.org/10.1109/MMSP.2012.6343452","url":null,"abstract":"Recent work has highlighted the importance of incorporating geometry information into the compression of depth maps. In prior approaches however the geometry information is not resolution scalable nor amenable to embedded coding. In this paper we propose a novel compression strategy for depth maps that incorporates geometry information while achieving the goals of scalability and embedded representation. Our scheme involves two separate image pyramid structures, one for breakpoints and other for sub-band samples produced by a breakpoint-adaptive transform. Breakpoints capture geometric attributes and are amenable to scalable coding. We develop an R-D optimization framework for the breakpoint data. We also use a variation of the EBCOT scheme to produce embedded bit-streams for both the breakpoint and sub-band data, allowing them to be independently and incrementally sequenced based on R-D considerations.","PeriodicalId":325274,"journal":{"name":"2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP)","volume":"219 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131763084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}