{"title":"Enhanced Steerable Pyramid Transformation for Medical Ultrasound Image Despeckling","authors":"Prerna Singh, R. Mukundan, Rex de Ryke","doi":"10.1109/MMSP.2018.8547091","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547091","url":null,"abstract":"The paper presents a novel approach for suppressing speckle noise while effectively preserving edge information in ultrasound images, enabling better clinical analysis and problem identification. The framework combines a modified adaptive Wiener filter (MAWF) with the Canny edge detection method and an enhanced steerable pyramid transformation (SPT) algorithm. The Canny algorithm detects the true edges in the noisy ultrasound (US) image, and the MAWF smoothens the speckle effect without affecting the edge information, which is preserved separately and added to the final output. The discrete Fourier transform (DFT) is used to extract the low- and high-frequency coefficients. Unlike other multiresolution techniques used for speckle suppression, the proposed method applies the steerable pyramid transformation to the high-frequency components extracted using the DFT for image enhancement. The coherence component extraction (CCE) method enhances the overall texture and edge features of the image, even in its darker portions. The output of this stage is finally combined with the stored edge information. The paper also presents experimental results showing that the proposed technique outperforms other state-of-the-art techniques in terms of peak signal-to-noise ratio, structural similarity index, and universal quality index.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124372209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking Recurrent Latent Variable Model for Music Composition","authors":"E. Koh, S. Dubnov, Dustin Wright","doi":"10.1109/MMSP.2018.8547061","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547061","url":null,"abstract":"We present a model for capturing musical features and creating novel sequences of music, called the Convolutional-Variational Recurrent Neural Network. To generate sequential data, the model uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music. Using the sequence-to-sequence model, our generative model can exploit samples from a prior distribution and generate a longer sequence of music. We compare the performance of our proposed model with other types of Neural Networks using the criteria of Information Rate that is implemented by Variable Markov Oracle, a method that allows statistical characterization of musical information dynamics and detection of motifs in a song. Our results suggest that the proposed model has a better statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122608688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature Fusion for Robust Patch Matching with Compact Binary Descriptors","authors":"Andrea Migliorati, A. Fiandrotti, Gianluca Francini, S. Lepsøy, R. Leonardi","doi":"10.1109/MMSP.2018.8547141","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547141","url":null,"abstract":"This work addresses the problem of learning compact yet discriminative patch descriptors within a deep learning framework. We observe that features extracted by convolutional layers in the pixel domain are largely complementary to features extracted in a transformed domain. We propose a convolutional network framework for learning binary patch descriptors in which pixel-domain features are fused with features extracted from the transformed domain. In our framework, while convolutional and transformed features are extracted separately, they are fused and provided to a single classifier, which thus operates jointly on convolutional and transformed features. We experiment with matching patches from three different datasets, showing that our feature fusion approach outperforms multiple state-of-the-art approaches in terms of accuracy, rate, and complexity.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"47 16","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131722308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoding Music in the Human Brain Using EEG Data","authors":"Chris Foster, Dhanush Dharmaretnam, Haoyan Xu, Alona Fyshe, G. Tzanetakis","doi":"10.1109/MMSP.2018.8547051","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547051","url":null,"abstract":"Semantic vectors, or language embeddings, are used in computational linguistics to represent language for a variety of machine related tasks including translation, speech to text, and natural language understanding. These semantic vectors have also been extensively studied in correlation with human brain data, showing evidence that the representation of language in the human brain can be modeled through these vectors with high correlation. Further, various attempts have been made to study how the human brain represents and understands music. For example, it has been shown that EEG data of subjects listening to music can be used for tempo detection and singer gender recognition. We propose studying the relationship between the EEG data of subjects listening to audio and the audio feature vectors modeled after the semantic vectors in computational linguistics. This could provide new insight into how the brain processes and understands music.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122199551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep Convolutional Network Based Supervised Coarse-to-Fine Algorithm for Optical Flow Measurement","authors":"Meiyuan Fang, Yanghao Li, Yuxing Han, Jiangtao Wen","doi":"10.1109/MMSP.2018.8547130","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547130","url":null,"abstract":"The measurement of optical flow is an important problem in image processing. A number of methods are available for optical flow estimation, including traditional variational methods and deep learning based supervised/unsupervised methods. In this work, we propose a deep convolutional neural network (CNN) based supervised coarse-to-fine approach, trained in an end-to-end fashion. The proposed method is tested on standard optical flow benchmark datasets, including Flying Chairs, MPI Sintel Clean and Final, and KITTI. Experimental results show that the proposed framework achieves results comparable to previous approaches with a much smaller network architecture.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121859274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robust HER2 Neural Network Classification Algorithm Using Biomarker-Specific Feature Descriptors","authors":"Prerna Singh, R. Mukundan","doi":"10.1109/MMSP.2018.8547043","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547043","url":null,"abstract":"Computer assisted evaluations of Whole Slide Images (WSI) of histopathological slides require robust biomarker-specific feature descriptors for accurate grading and classification. Considering the large amount of processing involved in analysing WSIs, training and classification, it is important to have an optimized set of features that closely represent the characteristics of the biomarkers used by pathologists in manual assessments. In this paper, we consider the problem of classifying WSIs of ImmunoHistoChemistry (IHC) stained slides for automated breast cancer grading. We use a combination of intensity and texture features derived from the input image at different saturation levels, and show its effectiveness in a Neural Network architecture for classifying the image into one of the four HER2 scores. The paper also presents three configurations for the neural network and gives comparative analysis showing the variations of classification accuracy with respect to changes in the configuration and the learning rate.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131172482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Inpainting Detection Based on a Modified Formulation of Canonical Correlation Analysis","authors":"Xiao Jin, Yuting Su, Yongwei Wang, Z. J. Wang","doi":"10.1109/MMSP.2018.8547106","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547106","url":null,"abstract":"Image inpainting is a common image editing technique for filling in missing areas of images. It can be adopted by forgers with ulterior motives to destroy the integrity of images. Compared with other types of inpainting, sparsity-based inpainting assumes more general prior knowledge and is more widely used in practical applications. Although several methods for detecting exemplar-based and diffusion-based inpainting have been proposed, there is a shortage of effective schemes for detecting sparsity-based inpainting. In this paper, we propose a novel algorithm for sparsity-based image inpainting detection. This type of inpainting has a strong effect on the coefficients of Canonical Correlation Analysis (CCA). Based on this observation, a modified objective function of CCA and a corresponding optimization algorithm are developed to enhance the inter-class differences in our feature set. Experiments on two publicly available datasets demonstrate our method's superiority over other competitors. In particular, unlike previous inpainting detection methods, the proposed framework performs better under JPEG compression.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123874876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast, Robust, and Accurate Image Denoising via Very Deeply Cascaded Residual Networks","authors":"Lulu Sun, Yongbing Zhang, Xingzheng Wang, Haoqian Wang, Qionghai Dai","doi":"10.1109/MMSP.2018.8547119","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547119","url":null,"abstract":"Patch-based image models have shown great potential in image denoising. They mainly exploit the nonlocal self-similarity (NSS) of either the input degraded images or clean natural ones when training models, but fail to learn the mappings between them. More seriously, these algorithms have very high time complexity and poor robustness when handling images with different noise variances and resolutions. To address these problems, in this paper we propose very deeply cascaded residual networks (VDCRN) to build precise relationships between noisy images and their noise-free counterparts. The model adopts a new residual unit with an identity skip connection (shortcut) to ease training and improve generalization. The shortcut helps avoid the vanishing gradient problem and preserves more image details. By cascading three such residual units, we build the VDCRN to deploy deeper and larger convolutional networks. Based on this residual architecture, our VDCRN achieves very fast speed and good robustness. Experimental results demonstrate that our model outperforms many state-of-the-art denoising algorithms both quantitatively and qualitatively.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"516 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116540603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CPNet: A Context Preserver Convolutional Neural Network for Detecting Shadows in Single RGB Images","authors":"S. Mohajerani, Parvaneh Saeedi","doi":"10.1109/MMSP.2018.8547080","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547080","url":null,"abstract":"Automatic detection of shadow regions in an image is a difficult task due to the lack of prior information about the illumination source and the dynamics of the scene objects. To address this problem, in this paper a deep-learning based segmentation method is proposed that identifies shadow regions at the pixel level in a single RGB image. We exploit a novel Convolutional Neural Network (CNN) architecture to identify and extract shadow features in an end-to-end manner. This network preserves learned contexts during training and observes the entire image to detect global and local shadow patterns simultaneously. The proposed method is evaluated on two publicly available datasets, SBU and UCF. We have improved the state-of-the-art Balanced Error Rate (BER) on these datasets by 22% and 14%, respectively.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128987156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Sub-Aperture Image Selection Refinement Method for Progressive Light Field Transmission","authors":"Wallace Bruno S. de Souza, B. Macchiavello, Eduardo Peixoto, E. Hung, Gene Cheung","doi":"10.1109/MMSP.2018.8547069","DOIUrl":"https://doi.org/10.1109/MMSP.2018.8547069","url":null,"abstract":"Light field cameras capture the light emanating from a scene. These images allow for changing the point of view or focal point by processing the captured information. Recently, Progressive Light Field Communication (PLFC) was proposed. PLFC addresses an interactive Light Field (LF) streaming framework, where a client requests a certain view or focal point and a server synthesizes and transmits each requested image as a linear combination of Sub-Aperture Images (SAI). The main idea of PLFC is that, as the virtual views are transmitted, the client gradually learns information about the LF, so eventually the client may possess enough information to locally create the virtual view at the required quality, avoiding the transmission of a new image. For PLFC to work, an optimization algorithm is required that selects the SAIs used to create a given virtual view. Here, we improve on the previous PLFC proposal by presenting a refinement algorithm for SAI selection that uses a dynamic Quantization Parameter (QP) during encoding, determines the Lagrangian multiplier automatically during optimization, and modifies how the initial required cache is created. These proposed changes to the algorithm produce significant gains. The results show BD-rate gains of up to 85.8% compared to trivial LF transmission, and up to 32.8% compared to the previous PLFC.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123178525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}