{"title":"Quality assessment of monocular 3D inference","authors":"Jorge Hernández","doi":"10.1109/ICIP.2016.7532508","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532508","url":null,"abstract":"Recently proliferation of 3D inference methods shows an important alternative to perceive in 3D of real world from single images. The quality evaluation of 3D estimated from inference methods has been demonstrated using dataset with 3D ground truth data. However in real scenarios, the 3D inference quality is complete unknown. In this work, we present a new quality assessment of 3D monocular inference. First, we define the notion of quality index for 3D inference data. Then, we present a weighted linear model of similarity metrics to estimate quality index. The method is based on hand crafted similarity measures among image representations of RGB image and 3D inferred data. We demonstrate the effectiveness of our proposed method using public datasets and 3D inference methods of state of the art.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"2016 1","pages":"1002-1006"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86310751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Redundant frame structure using M-frame for interactive light field streaming","authors":"B. Motz, Gene Cheung, Antonio Ortega","doi":"10.1109/ICIP.2016.7532582","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532582","url":null,"abstract":"A light field (LF) is a 2D array of closely spaced viewpoint images of a static 3D scene. In an interactive LF streaming (ILFS) scenario, a user successively requests desired neighboring viewpoints for observation, and in response the server must transmit pre-encoded data for correct decoding of the requested viewpoint images. Designing frame structures for ILFS is challenging, since at encoding time it is not known what navigation path a user will take, making differential coding very difficult to employ. In this paper, leveraging on a recent work on the merge operator - a new distributed source coding technique that efficiently merges differences among a set of side information (SI) frames into an identical reconstruction - we design redundant frame structures that facilitate ILFS, trading off expected transmission cost with total storage size. Specifically, we first propose a new view interaction model that captures view navigation tendencies of typical users. Assuming a flexible one-frame buffer at the decoder, we then derive a set of recursive equations that compute the expected transmission cost for a navigation lifetime of T views, given the proposed interaction model and a pre-encoded frame structure. Finally, we propose an algorithm that greedily builds a redundant frame structure, minimizing a weighted sum of expected transmission cost and total storage size. Experimental results show that our proposed algorithm generates frame structures with better transmission / storage tradeoffs than competing schemes.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"40 1","pages":"1369-1373"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87330787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spectral slopes for automated classification of land cover in landsat images","authors":"S. M. Aswatha, J. Mukhopadhyay, P. Biswas","doi":"10.1109/ICIP.2016.7533182","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533182","url":null,"abstract":"In the literature, various techniques for supervised/ semi-supervised classification of satellite imageries require manual selection of samples for each class. In this paper, we propose a spectral-slope based classification technique, which automates the process of initial labeling of a set of sample points. These are subsequently used in a supervised classifier as training samples and it performs the task of classification over all the pixels in the image. We demonstrate the effectiveness of our proposed classification technique in summarizing the changes in temporal image sets. For selecting the training samples from the satellite imageries, a set of rules is proposed by using the spectral-slope properties. We classify the land-cover into three classes, namely, water, vegetation, and vegetation-void, and validate the classification results using very high resolution satellite imagery. The approach has also been used in the analysis of images acquired by different sensors operating under similar wavelength ranges.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"131 1","pages":"4354-4358"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89115505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning a perceptual manifold for image set classification","authors":"Sriram Kumar, A. Savakis","doi":"10.1109/ICIP.2016.7533198","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533198","url":null,"abstract":"We present a biologically motivated manifold learning framework for image set classification inspired by Independent Component Analysis for Grassmann manifolds. A Grassmann manifold is a collection of linear subspaces, such that each subspace is mapped on a single point on the manifold. We propose constructing Grassmann subspaces using Independent Component Analysis for robustness and improved class separation. The independent components capture spatially local information similar to Gabor-like filters within each subspace resulting in better classification accuracy. We further utilize linear discriminant analysis or sparse representation classification on the Grassmann manifold to achieve robust classification performance. We demonstrate the efficacy of our approach for image set classification on face and object recognition datasets.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"126 1","pages":"4433-4437"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88995865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Laplacian-guided image decolorization","authors":"Cosmin Ancuti, C. Ancuti","doi":"10.1109/ICIP.2016.7533132","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533132","url":null,"abstract":"In this paper we introduce a novel decolorization strategy built on image fusion principles. Decolorization (color-to-grayscale), is an important transformation used in many monochrome image processing applications. We demonstrate that aside from color spatial distribution, local information plays an important role in maintaining the discriminability of the image conversion. Our strategy blends the three color channels R, G, B guided by two weight maps that filter the local transitions and measure the dominant values of the regions using the Laplacian information. In order to minimize artifacts introduced by the weight maps, our fusion approach is designed in a multi-scale fashion, using a Laplacian pyramid decomposition. Additionally, compared with most of the existing techniques our straightforward technique has the advantage to be computationally effective. We demonstrate that our technique is temporal coherent being suitable to decolorize videos. A comprehensive qualitative and also quantitative evaluation based on an objective visual descriptor demonstrates the utility of our decolorization technique.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"13 1","pages":"4107-4111"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81017882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computer-aided diagnostic tool for early detection of prostate cancer","authors":"Islam Reda, A. Shalaby, F. Khalifa, M. Elmogy, A. Aboulfotouh, M. El-Ghar, Ehsan Hosseini-Asl, N. Werghi, R. Keynton, A. El-Baz","doi":"10.1109/ICIP.2016.7532843","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532843","url":null,"abstract":"In this paper, we propose a novel non-invasive framework for the early diagnosis of prostate cancer from diffusion-weighted magnetic resonance imaging (DW-MRI). The proposed approach consists of three main steps. In the first step, the prostate is localized and segmented based on a new level-set model. In the second step, the apparent diffusion coefficient (ADC) of the segmented prostate volume is mathematically calculated for different b-values. To preserve continuity, the calculated ADC values are normalized and refined using a Generalized Gauss-Markov Random Field (GGMRF) image model. The cumulative distribution function (CDF) of refined ADC for the prostate tissues at different b-values are then constructed. These CDFs are considered as global features describing water diffusion which can be used to distinguish between benign and malignant tumors. Finally, a deep learning auto-encoder network, trained by a stacked non-negativity constraint algorithm (SNCAE), is used to classify the prostate tumor as benign or malignant based on the CDFs extracted from the previous step. Preliminary experiments on 53 clinical DW-MRI data sets resulted in 100% correct classification, indicating the high accuracy of the proposed framework and holding promise of the proposed CAD system as a reliable non-invasive diagnostic tool.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"8 1","pages":"2668-2672"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73946526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic character labeling for camera captured document images","authors":"Wei-liang Fan, K. Kise, M. Iwamura","doi":"10.1109/ICIP.2016.7532967","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532967","url":null,"abstract":"Character groundtruth for camera captured documents is crucial for training and evaluating advanced OCR algorithms. Manually generating character level groundtruth is a time consuming and costly process. This paper proposes a robust groundtruth generation method based on document retrieval and image registration for camera captured documents. We use an elastic non-rigid alignment method to fit the captured document image which relaxes the flat paper assumption made by conventional solutions. The proposed method allows building very large scale labeled camera captured documents dataset, without any human intervention. We construct a large labeled dataset consisting of 1 million camera captured Chinese character images. Evaluation of samples generated by our approach showed that 99.99% of the images were correctly labeled, even with different distortions specific to cameras such as blur, specularity and perspective distortion.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"79 1","pages":"3284-3288"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87116759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Where do emotions come from? Predicting the Emotion Stimuli Map","authors":"Kuan-Chuan Peng, Amir Sadovnik, Andrew C. Gallagher, Tsuhan Chen","doi":"10.1109/ICIP.2016.7532430","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532430","url":null,"abstract":"Which parts of an image evoke emotions in an observer? To answer this question, we introduce a novel problem in computer vision - predicting an Emotion Stimuli Map (ESM), which describes pixel-wise contribution to evoked emotions. Building a new image database, EmotionROI, as a benchmark for predicting the ESM, we find that the regions selected by saliency and objectness detection do not correctly predict the image regions which evoke emotion. Although objects represent important regions for evoking emotion, parts of the background are also important. Based on this fact, we propose using fully convolutional networks for predicting the ESM. Both qualitative and quantitative experimental results confirm that our method can predict the regions which evoke emotion better than both saliency and objectness detection.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"569 1","pages":"614-618"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87252616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video system for human attribute analysis using compact convolutional neural network","authors":"Yi Yang, F. Chen, Xiaoming Chen, Yan Dai, Zhenyang Chen, Jiang Ji, Tong Zhao","doi":"10.1109/ICIP.2016.7532424","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532424","url":null,"abstract":"Convolutional neural networks show their advantage in human attribute analysis (e.g. age, gender and ethnicity). However, they experience issues (e.g. robustness and responsiveness) when deployed in an intelligent video system. We propose one compact CNN model and apply it in our video system motivated by the full consideration of performance and usability. With the proposed web image mining and labelling strategy, we construct a large training set which covers various image conditions. The proposed CNN model successfully achieves a mean absolute error (MAE) of 3.23 years on the Morph 2 dataset, using the same test policy as our counterparts. This is the state-of-the-art score to our knowledge using CNN for age estimation. The proposed video analysis system employs this compact CNN model and demonstrated good performance in both dataset tests and deployment in real-world environments.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"53 1","pages":"584-588"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90282199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A weighted total variation approach for the atlas-based reconstruction of brain MR data","authors":"Mingli Zhang, Kuldeep Kumar, Christian Desrosiers","doi":"10.1109/ICIP.2016.7533177","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533177","url":null,"abstract":"Compressed sensing is a powerful approach to reconstruct high-quality images using a small number of samples. This paper presents a novel compressed sensing method that uses a probabilistic atlas to impose spatial constraints on the reconstruction of brain magnetic resonance imaging (MRI) data. A weighted total variation (TV) model is proposed to characterize the spatial distribution of gradients in the brain, and incorporate this information in the reconstruction process. Experiments on T1-weighted MR images from the ABIDE dataset show our proposed method to outperform the standard uniform TV model, as well as state-of-the-art approaches, for low sampling rates and high noise levels.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"18 1","pages":"4329-4333"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89001823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}