{"title":"A joint multi-scale convolutional network for fully automatic segmentation of the left ventricle","authors":"Qianqian Tong, Zhiyong Yuan, Xiangyun Liao, Mianlun Zheng, Weixu Zhu, Guian Zhang, Munan Ning","doi":"10.1109/ICIP.2017.8296855","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296855","url":null,"abstract":"Left ventricle (LV) segmentation is crucial for quantitative analysis of the cardiac contractile function. In this paper, we propose a joint multi-scale convolutional neural network to fully automatically segment the LV. Our method adopts two kinds of multi-scale features of cardiac magnetic resonance (CMR) images, including multi-scale features directly extracted from CMR images with different scales and multi-scale features constructed by intermediate layers of standard CNN architecture. We take advantage of these two strategies and fuse their prediction results to produce more accurate segmentation results. Qualitative results demonstrate the effectiveness and robustness of our method, and quantitative evaluation indicates our method achieves LV segmentation with higher accuracy than state-of-the-art approaches.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131479092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can no-reference image quality metrics assess visible wavelength iris sample quality?","authors":"Xinwei Liu, Marius Pedersen, C. Charrier, Patrick A. H. Bours","doi":"10.1109/ICIP.2017.8296939","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296939","url":null,"abstract":"The overall performance of iris recognition systems is affected by the quality of acquired iris sample images. Due to the development of imaging technologies, visible wavelength iris recognition gained a lot of attention in the past few years. However, iris sample quality of unconstrained imaging conditions is a more challenging issue compared to the traditional near infrared iris biometrics. Therefore, measuring the quality of such iris images is essential in order to have good quality samples for iris recognition. In this paper, we investigate whether general purpose no-reference image quality metrics can assess visible wavelength iris sample quality.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121433495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RGB-D data fusion in complex space","authors":"Ziyun Cai, Ling Shao","doi":"10.1109/ICIP.2017.8296625","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296625","url":null,"abstract":"Most of the RGB-D fusion methods extract features from RGB data and depth data separately and then simply concatenate them or encode these two kinds of features. Such frameworks cannot explore the correlation between the RGB pixels and their corresponding depth pixels. Motivated by the physical concept that range data correspond to the phase change and color information corresponds to the intensity, we first project raw RGB-D data into a complex space and then jointly extract features from the fused RGB-D images. Consequently, the correlated and individual parts of the RGB-D information in the new feature space are well combined. Experimental results of SIFT and fused images trained CNNs on two RGB-D datasets show that our proposed RGB-D fusion method can achieve competing performance against the classical fusion methods.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125401507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing optical to digital metrics: What is the optimal defocus in a rotationally symmetric system?","authors":"J. Portilla, S. Barbero","doi":"10.1109/ICIP.2017.8296524","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296524","url":null,"abstract":"We address the problem of finding the optimal focus of an optical system with spherical aberration, under three optimization criteria: the classical optical root-mean-square second order moment minimization, the expected mean square error on the sensor, and the expected mean square error at the output of a Wiener restoration filter. We observe that these three criteria may behave very differently, and, particularly, the classical optical criterion typically provides very poor results in comparison. This has a direct impact on the design of new hybrid optical-digital imaging systems, which account for the image quality after an optics-aware embedded digital restoration stage. We present some very encouraging results for simulations under different noise and aberration levels.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125900029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audio-visual attention: Eye-tracking dataset and analysis toolbox","authors":"Pierre Marighetto, A. Coutrot, Nicolas Riche, N. Guyader, M. Mancas, B. Gosselin, R. Laganière","doi":"10.1109/ICIP.2017.8296592","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296592","url":null,"abstract":"Although many visual attention models have been proposed, very few saliency models investigated the impact of audio information. To develop audio-visual attention models, researchers need to have a ground truth of eye movements recorded while exploring complex natural scenes in different audio conditions. They also need tools to compare eye movements and gaze patterns between these different audio conditions. This paper describes a toolbox that answer these needs by proposing a new eye-tracking dataset and its associated analysis ToolBox that contains common metrics to analysis eye movements. Our eye-tracking dataset contains the eye positions gathered during four eye-tracking experiments. A total of 176 observers were recorded while exploring 148 videos (mean duration = 22 s) split between different audio conditions (with or without sound) and visual categories (moving objects, landscapes and faces). Our ToolBox allows to visualize the temporal evolution of different metrics computed from the recorded eye positions. Both dataset and ToolBox are freely available to help design and assess visual saliency models for audiovisual dynamic stimuli.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130588779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel variational model for retinex in presence of severe noises","authors":"Lu Liu, Z. Pang, Y. Duan","doi":"10.1109/ICIP.2017.8296931","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296931","url":null,"abstract":"Retinex theory deals with compensation for illumination effects in images, which is usually an ill-posed problem. The existence of noises may severely challenge the performance of Retinex algorithms. Therefore, the main aim of this paper is to present a general variational Retinex model to effectively and robustly restore images corrupted by both noises and intensity inhomogeneities. Our strategy is to simultaneously recover the noise-free image and decompose it into reflectance and illumination component. The proposed model can be solved efficiently using the Alternating Direction Method of Multiplier (ADMM). Numerous experiments are conducted to demonstrate the advantages of the proposed model with Retinex illusions and medical image bias field correction for images in presence of Gaussian noise or impulsive noise.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129035974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Search video action proposal with recurrent and static YOLO","authors":"Romain Vial, Hongyuan Zhu, Yonghong Tian, Shijian Lu","doi":"10.1109/ICIP.2017.8296639","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296639","url":null,"abstract":"In this paper, we propose a new approach for searching action proposals in unconstrained videos. Our method first produces snippet action proposals by combining state-of-the-art YOLO detector (Static YOLO) and our regression based RNN detector (Recurrent YOLO). Then, these short action proposals are integrated to form final action proposals by solving two-pass dynamic programming which maximizes actioness score and temporal smoothness concurrently. Our experimental comparison with other state-of-the-arts on challenging UCF101 dataset shows that our method advances state-of-the-art proposal generation performance while maintaining low computational cost.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134223846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hand gesture recognition using a skeleton-based feature representation with a random regression forest","authors":"Shaun J. Canavan, Walter Keyes, Ryan Mccormick, Julie Kunnumpurath, Tanner Hoelzel, L. Yin","doi":"10.1109/ICIP.2017.8296705","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296705","url":null,"abstract":"In this paper, we propose a method for automatic hand gesture recognition using a random regression forest with a novel set of feature descriptors created from skeletal data acquired from the Leap Motion Controller. The efficacy of our proposed approach is evaluated on the publicly available University of Padova Microsoft Kinect and Leap Motion dataset, as well as 24 letters of the English alphabet in American Sign Language. The letters that are dynamic (e.g. j and z) are not evaluated. Using a random regression forest to classify the features we achieve 100% accuracy on the University of Padova Microsoft Kinect and Leap Motion dataset. We also constructed an in-house dataset using the 24 static letters of the English alphabet in ASL. A classification rate of 98.36% was achieved on this dataset. We also show that our proposed method outperforms the current state of the art on the University of Padova Microsoft Kinect and Leap Motion dataset.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116037837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining gaze and demographic feature descriptors for autism classification","authors":"Shaun J. Canavan, Melanie Chen, Song Chen, Robert Valdez, Miles Yaeger, H. Lin, L. Yin","doi":"10.1109/ICIP.2017.8296983","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8296983","url":null,"abstract":"People with autism suffer from social challenges and communication difficulties, which may prevent them from leading a fruitful and enjoyable life. It is imperative to diagnose and start treatments for autism as early as possible and, in order to do so, accurate methods of identifying the disorder are vital. We propose a novel method for classifying autism through the use of eye gaze and demographic feature descriptors that include a subject's age and gender. We construct feature descriptors that incorporate the subject's age and gender, as well as features based on eye gaze data. Using eye gaze information from the National Database for Autism Research, we tested our constructed feature descriptors on three different classifiers; random regression forests, C4.5 decision tree, and PART. Our proposed method for classifying autism resulted in a top classification rate of 96.2%.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123152763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards 3D convolutional neural networks with meshes","authors":"Miguel Domínguez, F. Such, Shagan Sah, R. Ptucha","doi":"10.1109/ICIP.2017.8297019","DOIUrl":"https://doi.org/10.1109/ICIP.2017.8297019","url":null,"abstract":"Voxels are an effective approach to 3D mesh and point cloud classification because they build upon mature Convolutional Neural Network concepts. We show however that their cubic increase in dimensionality is unsuitable for more challenging problems such as object detection in a complex point cloud scene. We observe that 3D meshes are analogous to graph data and can thus be treated with graph signal processing techniques. We propose a Graph Convolutional Neural Network (Graph-CNN), which enables mesh data to be represented exactly (not approximately as with voxels) with quadratic growth as the number of vertices increases. We apply Graph-CNN to the ModelNet10 classification dataset and demonstrate improved results over a previous graph convolution method.","PeriodicalId":229602,"journal":{"name":"2017 IEEE International Conference on Image Processing (ICIP)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117100688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}