{"title":"Towards skin image mosaicing","authors":"Khuram Faraz, W. Blondel, M. Amouroux, C. Daul","doi":"10.1109/IPTA.2016.7821014","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7821014","url":null,"abstract":"This paper presents a framework for mosaicing high resolution skin video sequences in the context of teleder-matology. While considering different stages of the mosaicing pipeline, including stitching and blending, several feature- and intensity-based image registration approaches are compared. Their performances in terms of quantitative and qualitative results are discussed so as to move towards the selection of the most suited approach. Although the intensity based approach proved to be more precise over short displacements, the feature based approach is advantageous in terms of computation time apart from being more reliable over large displacements, thus permitting a faster mosaic construction by skipping over some frames in the sequence.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121552096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards gaze-based video annotation","authors":"Mohamed Soliman, H. R. Tavakoli, Jorma T. Laaksonen","doi":"10.1109/IPTA.2016.7821028","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7821028","url":null,"abstract":"This paper presents our efforts towards a framework for video annotation using gaze. In computer vision, video annotation (VA) is an essential step in providing a ground truth for the evaluation of object detection and tracking techniques. VA is a demanding element in the development of video processing algorithms, where each object of interest should be manually labelled. Although the community has handled VA for a long time, the size of new data sets and the complexity of the new tasks pushes us to revisit it. A barrier towards automated video annotation is the recognition of the object of interest and tracking it over image sequences. To tackle this problem, we employ the concept of visual attention for enhancing video annotation. In an image, human attention naturally grasps interesting areas that provide valuable information for extracting the objects of interest, which can be exploited to annotate videos. Under task-based gaze recording, we utilize an observer's gaze to filter seed object detector responses in a video sequence. The filtered boxes are then passed to an appearance-based tracking algorithm. We evaluate the gaze usefulness by comparing the algorithm with gaze and without it. We show that eye gaze is an influential cue for enhancing the automated video annotation, improving the annotation significantly.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116845148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstruction of retinal spectra from RGB data using a RBF network","authors":"U. Nguyen, L. Laaksonen, H. Uusitalo, L. Lensu","doi":"10.1109/IPTA.2016.7820973","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820973","url":null,"abstract":"In comparison with the standard three-channel colour images, spectral retinal images provide more detailed information about the structure of the retina. However, the availability of spectral retinal images for the research and development of image analysis methods is limited. In this paper, we propose two approaches to reconstruct spectral retinal images based on common RGB images. The approaches make use of fuzzy c-means clustering to perform quantization of the image data, and the radial basis function network to learn the mapping from the three-component color representation to the spectral space. The dissimilarities between the reconstructed spectral images and the original ones are evaluated on a retinal image set with spectral and RGB images, and by using a standard spectral quality metric. The experimental results show that the proposed approaches are able to reconstruct spectral retinal images with a relatively high accuracy.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115226451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pedestrian detection using HOG, LUV and optical flow as features with AdaBoost as classifier","authors":"Rabia Rauf, A. R. Shahid, Sheikh Ziauddin, A. Safi","doi":"10.1109/IPTA.2016.7821024","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7821024","url":null,"abstract":"Pedestrian detection has been used in applications such as car safety, video surveillance, and intelligent vehicles. In this paper, we present a pedestrian detection scheme using HOG, LUV and optical flow features with AdaBoost Decision Stump classifier. Our experiments on Caltech-USA pedestrian dataset show that the proposed scheme achieves promising results of about 16.7% log-average miss rate.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124071647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved interpolation kernels for super resolution algorithms","authors":"P. Rasti, O. Orlova, G. Tamberg, C. Ozcinar, Kamal Nasrollahi, T. Moeslund, G. Anbarjafari","doi":"10.1109/IPTA.2016.7820980","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820980","url":null,"abstract":"Super resolution (SR) algorithms are widely used in forensics investigations to enhance the resolution of images captured by surveillance cameras. Such algorithms usually use a common interpolation algorithm to generate an initial guess for the desired high resolution (HR) image. This initial guess is usually tuned through different methods, like learning-based or fusion-based methods, to converge the initial guess towards the desired HR output. In this work, it is shown that SR algorithms can result in better performance if more sophisticated kernels than the simple conventional ones are used for producing the initial guess. The contribution of this work is to introduce such a set of kernels which can be used in the context of SR. The quantitative and qualitative results on many natural, facial and iris images show the superiority of the generated HR images over two state-of-the-art SR algorithms when their original interpolation kernel is replaced by the ones introduced in this work.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"37 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130863981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phylogeny of JPEG images by ancestor estimation using missing markers on image pairs","authors":"Noe Le Philippe, W. Puech, C. Fiorio","doi":"10.1109/IPTA.2016.7820992","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820992","url":null,"abstract":"Nowadays it is extremely easy to tamper with images and share them thanks to social media. Identifying the transformation history is imperative to be able to trust these images. We address this problem by using image phylogeny trees, where the root is the image that has been less tampered with and as every generation is obtained from the transformation of its parents, the leaves are the most transformed images. Our method for image phylogeny trees reconstruction is based on a binary decision between two images using JPEG compression artifacts. Experimental results show that when there is no missing image data, the reconstruction is very accurate.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133663854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RealSense = real heart rate: Illumination invariant heart rate estimation from videos","authors":"Jie Chen, Zhuoqing Chang, Qiang Qiu, Xiaobai Li, G. Sapiro, A. Bronstein, M. Pietikäinen","doi":"10.1109/IPTA.2016.7820970","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820970","url":null,"abstract":"Recent studies validated the feasibility of estimating heart rate from human faces in RGB video. However, test subjects are often recorded under controlled conditions, as illumination variations significantly affect the RGB-based heart rate estimation accuracy. Intel newly-announced low-cost RealSense 3D (RGBD) camera is becoming ubiquitous in laptops and mobile devices starting this year, opening the door to new and more robust computer vision. RealSense cameras produce RGB images with extra depth information inferred from a latent near-infrared (NIR) channel. In this paper, we experimentally demonstrate, for the first time, that heart rate can be reliably estimated from RealSense near-infrared images. This enables illumination invariant heart rate estimation, extending the heart rate from video feasibility to low-light applications, such as night driving. With the (coming) ubiquitous presence of RealSense devices, the proposed method not only utilizes its near-infrared channel, designed originally to be hidden from consumers; but also exploits the associated depth information for improved robustness to head pose.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134245043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Initiating GrabCut by color difference for automatic foreground extraction of passport imagery","authors":"Adria A. Sanguesa, Nicolai Krogh Jørgensen, Christian A. Larsen, Kamal Nasrollahi, T. Moeslund","doi":"10.1109/IPTA.2016.7820964","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820964","url":null,"abstract":"Grabcut, an iterative algorithm based on Graph Cut, is a popular foreground segmentation method. However, it suffers from a main drawback: a manual interaction is required in order to start segmenting the image. In this paper, four different methods based on image pairs are used to obtain an initial extraction of the foreground. Then, the obtained initial estimation of the foreground is used as input to the GrabCut algorithm, thus avoiding the need of interaction. Moreover, this paper is focused on passport images, which require an almost pixel-perfect segmentation in order to be a valid photo. Having gathered our own dataset and generated ground truth images, promising results are obtained in terms of F1-scores, with a maximum mean of 0.975 among all the images, improving the performance of GrabCut in all cases. Some future work directions are given for those unsolved issues that were faced, such as the segmentation in hair regions or tests in a non-uniform background scenario.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115659201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bit-stream-based scrambling for regions of interest in H.264/AVC videos with drift reduction","authors":"A. Unterweger, J. D. Cock, A. Uhl","doi":"10.1109/IPTA.2016.7820929","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820929","url":null,"abstract":"We propose a new scrambling approach for regions of interest in compressed H.264/AVC bit streams. By scrambling at bit stream level and applying drift reduction techniques, we reduce the processing time by up to 45% compared to full re-encoding. Depending on the input video quality, our approach induces an overhead between −0.5 and 1.5% (high resolution sequences) and −0.5 and 3% (low resolution sequences), respectively, to reduce the drift outside the regions of interest. The quality degradation in these regions remains small in most cases, and moderate in a worst-case scenario with a high number of small regions of interest.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130377878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of feature sensitivity to training data inaccuracy in detection of retinal lesions","authors":"L. Laaksonen, A. Hannuksela, E. Claridge, P. Fält, M. Hauta-Kasari, H. Uusitalo, L. Lensu","doi":"10.1109/IPTA.2016.7820975","DOIUrl":"https://doi.org/10.1109/IPTA.2016.7820975","url":null,"abstract":"Computer aided diagnostic and segmentation tools have become increasingly important in reducing the workload of medical experts performing diagnosis, monitoring and documentation of various eye diseases such as age-related macular degeneration (AMD), diabetic retinopathy (DR) and glaucoma. Supervised methods have been developed for the segmentation and detection of lesions, and the reported performance has been good. The supervised methods, however, need representative data to properly train the classifier. Inaccuracies in the ground truth may have a significant impact on the performance of a supervised method as the training data are not representative. In this study, a quantitative evaluation of the sensitivity of different image features, including colour, texture, edge and higher-level features, to inaccuracy in the ground truth on exudates is presented. A mean decrease of approx. 20% in sensitivity and 13% in specificity was observed when using the most inaccurate training data.","PeriodicalId":123429,"journal":{"name":"2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130771300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}