{"title":"Exemplar-based image inpainting: Fast priority and coherent nearest neighbor search","authors":"R. Martínez-Noriega, A. Roumy, G. Blanchard","doi":"10.1109/MLSP.2012.6349810","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349810","url":null,"abstract":"Greedy exemplar-based algorithms for inpainting face two main problems: deciding the filling-in order and selecting good exemplars from which the missing region is synthesized. We propose an algorithm that tackles these problems with improvements in the preservation of linear edges and a reduction of error propagation compared to well-known algorithms from the literature. Our improvement in the filling-in order is based on a combination of priority terms, previously defined by Criminisi, that better encourages the early synthesis of linear structures. The second contribution helps reduce error propagation thanks to a better detection of outliers among the candidate patches. This is obtained with a new metric that incorporates the whole information of the candidate patches. Moreover, our proposal has a significantly lower computational load than most of the algorithms used for comparison in this paper.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130004907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On surrogate supervision multiview learning","authors":"Gaole Jin, R. Raich","doi":"10.1109/MLSP.2012.6349759","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349759","url":null,"abstract":"In semi-supervised multi-view learning, the input vector is partitioned into two views and a classifier based on each view is sought after. In such settings, often examples which include the two views and a label are available [1]. In this paper, we are interested in the setting where a classifier for examples from one view is sought after although no labeled examples are provided for that view. Specifically, we consider the setting where labeled examples are provided only for the other view along with additional unlabeled examples of the two views jointly. To solve this problem, we present the Classification-Constrained Canonical Correlation Analysis (C4A) algorithm. We apply our algorithm to an audiovisual classification task. In comparison to two alternatives, the proposed method demonstrates superior performance.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128980279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Local distance metric learning for efficient conformal predictors","authors":"M. Pekala, Ashley J. Llorens, I-J. Wang","doi":"10.1109/MLSP.2012.6349813","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349813","url":null,"abstract":"Conformal prediction is a relatively recent approach to classification that offers a theoretical framework for generating predictions with precise levels of confidence. For each new object encountered, a conformal predictor outputs a set of class labels that contains the true label with probability at least 1 - ε, where ε is a user-specified error rate. The ability to predict with confidence can be extremely useful, but in many real-world applications unambiguous predictions consisting of a single class label are preferred. Hence it is desirable to design conformal predictors to maximize the rate of singleton predictions, termed the efficiency of the predictor. In this paper we derive a novel criterion for maximizing efficiency for a certain class of conformal predictors, show how concepts from local distance metric learning can provide a useful bound for maximizing this criterion, and demonstrate efficiency gains on real-world datasets.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123082904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noise-robust digit recognition with exemplar-based sparse representations of variable length","authors":"Emre Yilmaz, J. Gemmeke, Dirk Van Compernolle, H. V. hamme","doi":"10.1109/MLSP.2012.6349738","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349738","url":null,"abstract":"This paper introduces an exemplar-based noise-robust digit recognition system in which noisy speech is modeled as a sparse linear combination of clean speech and noise exemplars. Exemplars are rigid long speech units of different lengths, i.e. no warping mechanism is used for exemplar matching to avoid poor time alignments that would otherwise be provoked by the noise and the natural duration distribution of each unit in the training data is preserved. Speech and noise separation is performed by applying non-negative sparse coding using a separate exemplar dictionary for each labeled unit (in this case half-digits) rather than a single dictionary of all units. This approach does not only provide better classification of speech units but also models the temporal structure of speech and noise more accurately. The system performance is evaluated on the AURORA-2 database. The results show that the proposed system performs significantly better than a comparable system using a single dictionary at positive SNR levels.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125091860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Long term human activity recognition with automatic orientation estimation","authors":"Blanca Florentino-Liaño, N. O’Mahony, Antonio Artés-Rodríguez","doi":"10.1109/MLSP.2012.6349789","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349789","url":null,"abstract":"This work deals with the elimination of sensitivity to sensor orientation in the task of human daily activity recognition using a single miniature inertial sensor. The proposed method detects time intervals of walking, automatically estimating the orientation in these intervals and transforming the observed signals to a “virtual” sensor orientation. Classification results show that excellent performance, in terms of both precision and recall (up to 100%), is achieved, for long-term recordings in real-life settings.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117172510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differential edit distance as a countermeasure to video scene ambiguity","authors":"P. Sidiropoulos, V. Mezaris, Y. Kompatsiaris","doi":"10.1109/MLSP.2012.6349722","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349722","url":null,"abstract":"In this work the problem of how to evaluate video scene segmentation results is examined. The evaluation, which is typically conducted by comparison of the experimental output of scene segmentation algorithms with a ground-truth temporal decomposition, often suffers from ambiguity in the definition of the ground truth. To alleviate this drawback the use of a string comparison measure, called differential edit distance (DED), is proposed. After defining video scene segmentation evaluation as a string comparison problem, the proposed measure is applied to limit the effect of scene segmentation ambiguity in the performance estimation uncertainty. The experimental results, which include comparisons with state of the art evaluation measures, demonstrate the ambiguity extent and verify the validity of the conducted analysis.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116355732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stereophonic spectrogram segmentation using Markov random fields","authors":"Minje Kim, P. Smaragdis, Glenn G. Ko, Rob A. Rutenbar","doi":"10.1109/MLSP.2012.6349754","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349754","url":null,"abstract":"There is a good amount of similarity between source separation approaches that use spectrograms captured from multiple microphones and computer vision algorithms that use multiple images for segmentation problems. Just as one would use Markov random fields (MRF) to solve image segmentation problems, we propose a method of modeling source separation using MRFs, and then solving such problems via common MRF inference methods. To this end, as a preprocessing step, we convert stereophonic spectrograms into an integrated form based on their inter-channel level differences (ILD), which is a procedure analogous to getting a disparity map from stereo images for matching problems. Given the ILD matrix as an observed image, we estimate latent labels which stand for the responsibility of each spectrogram's time/frequency bin to a specific sound source. The proposed method shows reasonable separation performance in a variety of mixing environments including online separation and moving sources. We expect this new way of formulating source separation problems to help exploit advantages of probabilistic graphical models and the recent advances in low-power, high-performance hardware suited for such tasks.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114808551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonnegative matrix factorization based self-taught learning with application to music genre classification","authors":"K. Markov, T. Matsui","doi":"10.1109/MLSP.2012.6349719","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349719","url":null,"abstract":"Availability of large amounts of raw unlabeled data has sparked the recent surge in semi-supervised learning research. In most works, however, it is assumed that labeled and unlabeled data come from the same distribution. This restriction is removed in the self-taught learning approach where unlabeled data can be different, but nevertheless have similar structure. First, a representation is learned from the unlabeled data via non-negative matrix factorization (NMF) and then it is applied to the labeled data used for classification. In this work, we implemented this method for the music genre classification task using two different databases: one as unlabeled data pool and the other for supervised classifier training. Music pieces come from 10 and 6 genres for each database respectively, while only one genre is common to both of them. Results from a wide variety of experimental settings show that the self-taught learning method improves the classification rate when the amount of labeled data is small and, more interestingly, that consistent improvement can be achieved for a wide range of unlabeled data sizes.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115181860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A modified version of the MEXICO algorithm for performing ICA over Galois fields","authors":"D. G. Silva, Everton Z. Nadalin, R. Attux, J. Filho","doi":"10.1109/MLSP.2012.6349741","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349741","url":null,"abstract":"The theory of ICA over finite fields, established in the last five years, gave rise to a corpus of different separation strategies, which includes an algorithm based on the pairwise comparison of mixtures, called MEXICO. In this work, we propose an alternative version of the MEXICO algorithm, with modifications that - as shown by the results obtained for a number of representative scenarios - lead to performance improvements in terms of the computational effort required to reach a certain performance level, especially for an elevated number of sources. This parsimony can be relevant to enhance the applicability of the new ICA theory to data mining in the context of large discrete-valued databases.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123758471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel adaptive Nyström approximation","authors":"Lingyan Sheng, Antonio Ortega","doi":"10.1109/MLSP.2012.6349777","DOIUrl":"https://doi.org/10.1109/MLSP.2012.6349777","url":null,"abstract":"We propose a novel perspective on the Nyström approximation method. Sampling the columns of the kernel matrix can be interpreted as projecting the data onto the subspace spanned by the corresponding columns. Thus, the quality of the Nyström approximation can be quantified by the distance between the subspace spanned by the sampled columns and the subspace spanned by the data mapped to the eigenvectors corresponding to the top eigenvalues of the kernel matrix. Based on this interpretation, we design a novel adaptive Nyström approximation algorithm, BoostNyström. BoostNyström is efficient in terms of both time and space complexity. Experiments on benchmark data sets show that BoostNyström is more effective than the state-of-the-art algorithms.","PeriodicalId":262601,"journal":{"name":"2012 IEEE International Workshop on Machine Learning for Signal Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121299238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}