{"title":"Comparison of different feature extraction techniques in content-based image retrieval for CT brain images","authors":"Wan Siti Halimatul Munirah Wan Ahmad, M. F. A. Fauzi","doi":"10.1109/MMSP.2008.4665130","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665130","url":null,"abstract":"A content-based image retrieval (CBIR) system helps users retrieve relevant images based on their contents. A reliable content-based feature extraction technique is therefore required to effectively extract most of the information from the images. These important elements include texture, colour, intensity or shape of the object inside an image. CBIR, when used in medical applications, can help medical experts in their diagnosis, such as retrieving similar kinds of disease and monitoring a patient's progress. In this paper, several feature extraction techniques are explored to see their effectiveness in retrieving medical images. The techniques are Gabor transform, discrete wavelet frame, Hu moment invariants, Fourier descriptor, gray level histogram and gray level coherence vector. Experiments are conducted on 3,032 CT images of the human brain and promising results are reported.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125749014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient stereo bitrate allocation for fully scalable audio codec","authors":"Te Li, S. Rahardja, S. Koh","doi":"10.1109/MMSP.2008.4665206","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665206","url":null,"abstract":"The bit allocation algorithm for stereo channels in MPEG-4 scalable lossless coding (SLS) is not optimized. A perceptually enhanced stereo bit allocation algorithm for fully scalable audio coding is presented in this paper. According to the energy distribution in different channels, the bitrate is allocated in a much more efficient manner. Experiment results show that the proposed method significantly improves the perceptual quality of the fully scalable audio at various bitrates without introducing any new side information.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127341299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motion modeling with separate quad-tree structures for geometry and motion","authors":"R. Mathew, D. Taubman","doi":"10.1109/MMSP.2008.4665106","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665106","url":null,"abstract":"Quad-tree structures are often used to model motion between frames of a video sequence. However, a fundamental limitation of the quad-tree structure is that it can only capture horizontal and vertical edge discontinuities at dyadically related locations. To address this limitation, recent work has focused on the introduction of geometry information to nodes of tree structured motion representations. In this paper we explore modeling boundary geometry and motion with separate quad-tree structures. Recent work into quad-tree representations has also highlighted the benefits of leaf merging. We extend the leaf merging paradigm to incorporate separate tree structures for boundary geometry and motion. To achieve an efficient joint representation we introduce polynomial motion models and piecewise linear boundary geometry to our quad-tree structures. Experimental results show that the approach taken in this paper provides significant improvement over previous quad-tree based motion representation schemes.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126796832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Standard-compliant multiple description image coding by spatial multiplexing and constrained least-squares restoration","authors":"Xiangjun Zhang, Xiaolin Wu","doi":"10.1109/MMSP.2008.4665102","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665102","url":null,"abstract":"We propose a practical standard-compliant multiple description (MD) image coding technique. Multiple descriptions of an image are generated in the spatial domain by an adaptive prefiltering and uniform down sampling process. The resulting side descriptions are conventional square sample grids that are interleaved with one another. As such, each side description can be coded by any of the existing image compression standards. A side decoder reconstructs the input image by first decompressing the down-sampled image and then solving a least-squares inverse problem, guided by a two-dimensional windowed piecewise autoregressive model. The central decoder is algorithmically similar to the side decoder, but it improves the reconstruction quality by using received side descriptions as additional constraints when solving the underlying inverse problem. Compared with its predecessors, the proposed image MD technique offers the lowest encoder complexity, complete standard compliance, competitive rate-distortion performance, and superior subjective quality.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127208499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image registration by means of 3D octree correlation","authors":"C. Ruwwe, B. Keck, Oliver Rusch, U. Zölzer, Xavier Loison","doi":"10.1109/MMSP.2008.4665132","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665132","url":null,"abstract":"With no calibrated camera setup at hand, careful inspection of the imagery is needed to guarantee a feasible 3D reconstruction result based upon the images. We propose a new approach for image registration based on reconstructed 3D octrees by voxel carving. Correlation of these models gives rise to a translation offset for a maximum intersection between different models from different images. Projecting the resulting three-dimensional translation offsets back into the image plane results in two two-dimensional image offsets that are used for the image registration.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127580306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graphical modeling and decoding of human actions","authors":"W. Li, Zhengyou Zhang, Zicheng Liu","doi":"10.1109/MMSP.2008.4665070","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665070","url":null,"abstract":"This paper presents a graphical model for learning and recognizing human actions. Specifically, we propose to encode actions in a weighted directed graph, referred to as action graph, where nodes of the graph represent salient postures that are used to characterize the actions and shared by all actions. The weight between two nodes measures the transitional probability between the two postures. An action is encoded as one or multiple paths in the action graph. The salient postures are modeled using Gaussian mixture models (GMM). Both the salient postures and action graph are automatically learned from training samples through unsupervised clustering and expectation and maximization (EM) algorithm. Experimental results have verified the performance of the proposed model, its tolerance to noise and viewpoints and its robustness across different subjects and datasets.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122531784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CCL-SVC: Optimizing user experience of broadcasting video on computation capability limited handheld devices","authors":"Jingyuan Wang, Lifeng Sun, Bin Li, Meng Zhang, Shiqiang Yang","doi":"10.1109/MMSP.2008.4665117","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665117","url":null,"abstract":"In this paper, we propose a novel scheme using computing complexity layered scalable video coding (CCL-SVC) to optimize the user experience of broadcasting video on handheld terminals with limited computing capability. To address the heterogeneity of computing capability among different handheld devices, we employ the hierarchical B reference structure of SVC to divide the frames into multiple computing complexity layers (CC Layers) on the server side. The handheld clients simply choose to decode the frames in their corresponding layers according to their computation capability to maximize the video PSNR. We have proved that the optimal CC Layers division problem is a precedence constrained scheduling problem, which is NP-complete. We further propose a fast greedy method to approximately optimize the broadcast video playback PSNR. The simulation shows that our method is superior to temporal SVC and to random frame discarding.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123861336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse approximations for joint source-channel coding","authors":"G. Rath, C. Guillemot, J. Fuchs","doi":"10.1109/MMSP.2008.4665126","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665126","url":null,"abstract":"This paper considers the application of sparse approximations in a joint source-channel (JSC) coding framework. The considered JSC coded system employs a real number BCH code on the input signal before the signal is quantized and further processed. Under an impulse channel noise model, the decoding of error is posed as a sparse approximation problem. The orthogonal matching pursuit (OMP) and basis pursuit (BP) algorithms are compared with the syndrome decoding algorithm in terms of mean square reconstruction error. It is seen that, with a Gauss-Markov source and Bernoulli-Gaussian channel noise, the BP outperforms the syndrome decoding and the OMP at higher noise levels. In the case of image transmission with channel bit errors, the BP outperforms the other two decoding algorithms consistently.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123996434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Singular block Toeplitz matrix approximation and application to multi-microphone speech dereverberation","authors":"Samir-Mohamad Omar, D. Slock","doi":"10.1109/MMSP.2008.4665048","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665048","url":null,"abstract":"We consider the blind multichannel dereverberation problem for a single source. We have shown before [5] that the single-input multi-output (SIMO) reverberation filter can be equalized blindly by applying MIMO Linear Prediction (LP) to its output (after SISO input pre-whitening). In this paper, we investigate the LP-based dereverberation in a noisy environment, and/or under acoustic channel length underestimation. Considering ambient noise and late reverberation as additive noises, we propose to introduce a postfilter that transforms the MIMO prediction filter into a somewhat longer equalizer. The postfilter allows equalization to non-zero delay. Both MMSE-ZF and MMSE design criteria are considered here for the postfilter. We also focus here on computationally efficient (FFT based) block Toeplitz covariance matrix enhancement that enforces the SIMO filtered source plus white noise structure before applying MIMO LP. A second suggested refinement is an iterative refinement between SISO and MIMO LP. Simulations show that the proposed scheme is robust in noisy environments, and performs better than the classic Delay-&-Predict equalizer and the Delay-&-Sum beamformer.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116736593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Multiple Experts System for personal identification using facial behaviour biometrics","authors":"Pohsiang Tsai, Tich Phuoc Tran, T. Hintz, T. Jan","doi":"10.1109/MMSP.2008.4665158","DOIUrl":"https://doi.org/10.1109/MMSP.2008.4665158","url":null,"abstract":"Physiological and/or behavioural characteristics of humans such as face, gait and/or voice have been used in biometric recognition technology. Apart from these characteristics (which have been reported in the literature), the hypothesis of this research was to investigate if facial behaviour could be used for human identification. We analysed and proposed a multiple experts system, called Adaptive Multiple Experts System (AMES), for validating our hypothesis and analysis. We used the Japanese Female Facial Expression (JAFFE) database as it provides the facial behaviour traits for data collection. The experimental results indicate that facial behaviours may provide information about individual difference and, thus may be used as another behavioural biometric.","PeriodicalId":402287,"journal":{"name":"2008 IEEE 10th Workshop on Multimedia Signal Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115201708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}