{"title":"Learning graph fusion for query and database specific image retrieval","authors":"Chih-Kuan Yeh, Wei-Chieh Wu, Y. Wang","doi":"10.1109/MMSP.2016.7813337","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813337","url":null,"abstract":"In this paper, we propose a graph-based image retrieval algorithm via query and database specific feature fusion. While feature fusion approaches exist for image retrieval, they typically do not consider the image database of interest (i.e., the one to be retrieved from) when observing the associated feature contributions. In the offline learning stage, our proposed method first identifies representative features for describing the images to be retrieved. Given a query input, we further exploit and integrate its visual information and utilize graph-based fusion for performing query-database specific retrieval. In our experiments, we show that our proposed method achieves promising performance on the UKbench benchmark database, and performs favorably against recent fusion-based image retrieval approaches.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115367069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust MRI reconstruction via re-weighted total variation and non-local sparse regression","authors":"Mingli Zhang, Christian Desrosiers","doi":"10.1109/MMSP.2016.7813392","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813392","url":null,"abstract":"Total variation (TV) based sparsity and non-local self-similarity have been shown to be powerful tools for the reconstruction of magnetic resonance (MR) images. However, due to the uniform regularization of gradient sparsity, standard TV approaches often over-smooth edges in the image, resulting in the loss of important details. This paper presents a novel compressed sensing method for the reconstruction of MRI data, which uses a regularization strategy based on re-weighted TV to preserve image edges. This method also leverages the redundancy of non-local image patches through the use of a sparse regression model. An efficient strategy based on the Alternating Direction Method of Multipliers (ADMM) algorithm is used to recover images with the proposed model. Experimental results on a simulated phantom and real brain MR data show our method to outperform state-of-the-art compressed sensing approaches, by better preserving edges and removing artifacts in the image.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129910202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Global anomaly detection in crowded scenes based on optical flow saliency","authors":"Ang Li, Z. Miao, Yigang Cen","doi":"10.1109/MMSP.2016.7813390","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813390","url":null,"abstract":"In this paper, an algorithm for global anomaly detection in crowded scenes using the saliency of the optical flow field is proposed. Before extracting the histogram of maximal optical flow projection (HMOFP), the scale-invariant feature transform (SIFT) method is utilized to obtain the saliency map of the optical flow field. On the basis of the HMOFP feature of normal frames, the online dictionary learning algorithm is used to train an optimal dictionary with proper redundancy after a process of selecting the training samples, which is better than a dictionary simply composed of the HMOFP features of all training frames. To detect whether a frame is normal or not, we use the ℓ1-norm of the sparse reconstruction coefficients (i.e., the sparse reconstruction cost, SRC) to indicate the anomaly of the testing frame, which is simple but very effective. The experimental results on the UMN dataset and the comparison to state-of-the-art methods show that our algorithm is promising.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133747613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient imaging through scattering media by random sampling","authors":"Yifu Hu, Xin Jin, Qionghai Dai","doi":"10.1109/MMSP.2016.7813348","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813348","url":null,"abstract":"Imaging through scattering media is a tough task in computational imaging. A recent breakthrough technique based on speckle scanning was proposed with outstanding imaging performance. However, to achieve high imaging quality, dense sampling of the integrated intensity matrix is needed, which leads to a time-consuming scanning process. In this paper, we propose a method that exploits the spatial redundancy of the integrated intensity matrix and reconstructs the complete matrix from a few random samples. A reconstruction model that jointly penalizes the total variation and the weighted sum of the nuclear norms of local patches is built, with improved reconstruction quality. Experiments are performed to verify the effectiveness of the proposed method, and the results demonstrate that it can achieve the same imaging quality with an 80% reduction in data acquisition complexity.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130677950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using cardio-respiratory signals to recognize emotions elicited by watching music video clips","authors":"Leila Mirmohamadsadeghi, A. Yazdani, J. Vesin","doi":"10.1109/MMSP.2016.7813349","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813349","url":null,"abstract":"The automatic recognition of human emotions from physiological signals is of increasing interest in many applications. Images with high emotional content have been shown to alter signals such as the electrocardiogram (ECG) and the respiration among many other physiological recordings. However, recognizing emotions from multimedia stimuli, such as music video clips, which are growing in numbers in the digital world and are the medium of many recommendation systems, has not been adequately investigated. This study aims to investigate the recognition of emotions elicited by watching music video clips, from features extracted from the ECG, the respiration and several synchronization aspects of the two. On a public dataset, we achieved higher classification rates than the state-of-the-art using either the ECG or the respiration signals alone. A feature related to the synchronization of the two signals achieved even better performance.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117213543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient detail-enhanced exposure correction based on auto-fusion for LDR image","authors":"Jiayi Chen, Xuguang Lan, Meng Yang","doi":"10.1109/MMSP.2016.7813345","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813345","url":null,"abstract":"We consider the problem of how to simultaneously and well correct the over- and under-exposed regions in a single low dynamic range (LDR) image. Recent methods typically focus on global visual quality but cannot recover many potential details in severely mis-exposed areas, and some are also time-consuming. In this paper, we propose a fast, detail-enhanced correction method based on automatic fusion, which combines a pair of complementarily corrected images, i.e. the backlight and highlight correction images (BCI & HCI). A BCI with higher visual quality in details is quickly produced by a proposed faster multi-scale retinex algorithm; meanwhile, an HCI is generated through a contrast enhancement method. Then, an automatic fusion algorithm is proposed to create a color-protected exposure mask for fusing the BCI and HCI while avoiding potential artifacts at the boundary. The experimental results show that the proposed method can quickly correct over-/under-exposed regions with higher detail quality than existing methods.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"256 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132044367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Background simplification for ROI-oriented low bitrate video coding","authors":"Benoit Boyadjis, Cyril Bergeron, B. Pesquet-Popescu, F. Dufaux","doi":"10.1109/MMSP.2016.7813365","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813365","url":null,"abstract":"Low-bitrate video compression is a challenging task, particularly with the increasing complexity of video sequences. Re-shaping video data before its compression with modern hybrid encoders has provided interesting results in the low and ultra-low bit rate domains. In this work, we propose a novel saliency guided preprocessing approach, which combines adaptive re-sampling and background texture removal, to achieve efficient ROI-oriented compression. Evaluated with HEVC, we show that our solution improves the ROI encoding over a wide range of resolutions and bit rates whilst maintaining a high background intelligibility level.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124530600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel affective features for multiscale prediction of emotion in music","authors":"Naveen Kumar, T. Guha, Che-Wei Huang, Colin Vaz, Shrikanth S. Narayanan","doi":"10.1109/MMSP.2016.7813377","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813377","url":null,"abstract":"The majority of computational work on emotion in music concentrates on developing machine learning methodologies to build new, more accurate prediction systems, and usually relies on generic acoustic features. Relatively less effort has been put into the development and analysis of features that are particularly suited to the task. The contribution of this paper is twofold. First, the paper proposes two features that can efficiently capture the emotion-related properties in music. These features are named compressibility and sparse spectral components. They are designed to capture the overall affective characteristics of music (global features). We demonstrate that they can predict emotional dimensions (arousal and valence) with high accuracy compared to generic audio features. Second, we investigate the relationship between the proposed features and the dynamic variation in the emotion ratings. To this end, we propose a novel Haar transform-based technique to predict dynamic emotion ratings using only global features.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127773559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simple approach towards efficient partial-copy video detection","authors":"Zobeida J. Guzman-Zavaleta, C. F. Uribe","doi":"10.1109/MMSP.2016.7813396","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813396","url":null,"abstract":"Video copy detection is still an open problem, as current approaches are not able to carry out the detection with enough efficacy and efficiency. These are desirable features in modern video-based applications requiring real-time processing in large-scale video databases without compromising detection performance, especially when facing non-simulated video attacks. These characteristics are also desirable in partial-copy detection, where the detection challenges increase when the video query contains only short segments corresponding to a copied video, that is, partial copies at the frame level. Motivated by these issues, in this work we propose a video fingerprinting approach based on the extraction of a set of low-cost and independent binary global and local fingerprints. We tested our approach with a video dataset of real copies, and the results show that our method outperforms robust state-of-the-art methods in terms of both detection scores and computational efficiency. The latter is achieved by processing only short segments of 1 second in length, which takes a processing time of 44 ms.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114593480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the applicability of the SBC codec to support super-wideband speech in Bluetooth handsfree communications","authors":"Nathan Souviraà-Labastie, S. Ragot","doi":"10.1109/MMSP.2016.7813378","DOIUrl":"https://doi.org/10.1109/MMSP.2016.7813378","url":null,"abstract":"With the recent standardization of the Enhanced Voice Services (EVS) codec in 3GPP, mobile operators can upgrade their voice services to offer super-wideband (SWB) audio quality (with a 32 kHz sampling rate). There is however one important use case which is currently limited by existing standards: hands-free communication with wireless headsets, car kits, or connected audio devices often relies on Bluetooth, and the Hands-Free Profile (HFP) in Bluetooth is currently limited to narrowband and wideband speech. Following the approach used to extend HFP to support wideband, we study in this paper the applicability of the SBC codec to further extend HFP to SWB. An evaluation of performance is provided taking into account Bluetooth system constraints.","PeriodicalId":113192,"journal":{"name":"2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129936654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}