Liuan Wang, Wei-liang Fan, Yuan He, Jun Sun, Yutaka Katsuyama, Y. Hotta
{"title":"Text detection in natural scene images with user-intention","authors":"Liuan Wang, Wei-liang Fan, Yuan He, Jun Sun, Yutaka Katsuyama, Y. Hotta","doi":"10.1109/ICPR.2014.503","DOIUrl":"https://doi.org/10.1109/ICPR.2014.503","url":null,"abstract":"We propose an accurate and robust coarse-to-fine text detection scheme with user-intention which captures the intrinsic characteristics of natural scene texts. In the coarse detection stage, a double edge detector is designed to estimate the symmetry of stroke and the stroke width, which help segment the foreground. Then the initial user-intention region is extended to generate a coarse bounding box based on the estimated foreground. In the refinement stage, candidate connected components (CCs) from Niblack decomposition, are grouped together by location to form text lines after noise removal and layer selection. Experimental results demonstrate the effectiveness of the proposed method which yields higher performance compared with state-of-the-art methods.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132560956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph based segmentation with minimal user interaction","authors":"Huaizhong Zhang, Ehab Essa, Xianghua Xie","doi":"10.1109/ICIP.2013.6738839","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738839","url":null,"abstract":"In this paper, we present a graph based segmentation method that only requires a single point from user initialization. We incorporate a new image feature into the segmentation scheme. It is derived from a vector field that takes into account gradient vector interactions across the image domain, and has the simplicity of edge based features but also proves to be a useful region indication in two-level segmentation. Effective vector field diffusion is proposed to deal with excessive image noise. Based on a single user point we unravel the image and transfer the object segmentation into a height field segmentation in polar coordinates, which in effect imposes a star shape prior. The search of a minimum closed set on a node weighted, directed graph produces the segmentation result. Comparative analysis on real world images demonstrates promising performances of the proposed method in segmentation accuracy and its simplicity in user interaction.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127110729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng Yang, U. Ugbolue, B. Carse, V. Stanković, L. Stanković, P. Rowe
{"title":"Multiple marker tracking in a single-camera system for gait analysis","authors":"Cheng Yang, U. Ugbolue, B. Carse, V. Stanković, L. Stanković, P. Rowe","doi":"10.1109/ICIP.2013.6738644","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738644","url":null,"abstract":"Human gait analysis for stroke rehabilitation therapy using video processing tools has become popular in recent years. This paper proposes a single-camera system for capturing gait patterns using a Kalman-Structural-Similarity-based algorithm which tracks multiple markers simultaneously. This algorithm is initialized by obtaining the user-selected blocks in the first frame of each video, and the tracker is implemented by using Structural-Similarity image quality assessment algorithm to detect each marker frame by frame within a search area determined by a discrete Kalman filter. Experimental results show the trajectories of the markers fixed on the joints of a human body. The obtained numerical results are used to generate gait information (e.g., knee joint angle) that is later used for diagnostics. The proposed method aims to explore an alternative and portable way to implement human gait analysis with significantly less cost compared to a state-of-the-art 3D motion capture system.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123131995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image denoising using dual tree statistical models for complex wavelet transform coefficient magnitudes","authors":"P. Hill, A. Achim, D. Bull, M. Al-Mualla","doi":"10.1109/ICIP.2013.6738019","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738019","url":null,"abstract":"Wavelet shrinkage is a standard technique for denoising natural images. Originally proposed for univariate shrinkage in the Discrete Wavelet Transform (DWT) domain, it has since been optimised through the exploitation of translationally invariant wavelet decompositions such as the Dual-Tree Complex Wavelet Transform (DT-CWT) alongside bivariate analysis techniques that condition the shrinkage on spatially related coefficients across neighbouring scales. These more recent techniques have denoised the real and imaginary components of the DT-CWT coefficients separately. Processing real and imaginary components separately has been found to lead to an increase in the phase noise of the transform which in turn affects denoising performance. On this basis, the work presented in this paper offers improved denoising performance through modelling the bivariate distribution of the coefficient magnitudes. The results were compared to the current state of the art non-local means denoising technique BM3D, showing clear subjective improvements, through the retention of high frequency structural and textural information. The paper also compares objective measures, using both PSNR and the more perceptually valid structural similarity measure (SSIM). Whereas PSNR results were slightly below those for BM3D, those for SSIM showed closer correlation with subjective assessment, indicating improvements over BM3D for most noise levels on the images tested.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"2013 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127419755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Anantrasirichai, Lindsay B. Nicholson, J. Morgan, I. Erchova, A. Achim
{"title":"Adaptive-weighted bilateral filtering for optical coherence tomography","authors":"N. Anantrasirichai, Lindsay B. Nicholson, J. Morgan, I. Erchova, A. Achim","doi":"10.1109/ICIP.2013.6738229","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738229","url":null,"abstract":"This paper presents an image enhancement method for retinal optical coherence tomography (OCT) images. Raw OCT images contain a large amount of speckle which causes images to be grainy and very low contrast. The raw OCT images thus need to be processed before any clinical interpretation is made. We propose a novel method to remove speckle, while preserving useful information contained in each retinal layer. The process starts with multi-scale despeckling based on a dual-tree complex wavelet transform (DT-CWT). Then, we further enhance the OCT image through a smoothing process that uses a novel adaptive-weighted bilateral filter (AWBF). This offers the desirable property of preserving texture within the OCT images. Glaucoma classification results confirm that our method can significantly enhance the clinical usefulness of OCT images.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124219174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video super-resolution using low rank matrix completion","authors":"Jin Chen, J. Núñez-Yáñez, A. Achim","doi":"10.1109/ICIP.2013.6738283","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738283","url":null,"abstract":"In this paper, a novel video super-resolution image reconstruction algorithm is proposed. We design a patch-based low rank matrix completion algorithm. The proposed algorithm addresses the problem of generating a high-resolution (HR) image from several low-resolution (LR) images, based on sparse representation and low-rank matrix completion. The approach represents observed LR frames in the form of sparse matrices and rearranges those frames into low dimensional constructions. Experimental results demonstrate that, high-frequency details in the super resolved images are recovered from the LR frames. The gains in terms of PSNR and SSIM are significant.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116042050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuji Oishi, R. Kurazume, Y. Iwashita, T. Hasegawa
{"title":"Hole-free texture mapping based on laser reflectivity","authors":"Shuji Oishi, R. Kurazume, Y. Iwashita, T. Hasegawa","doi":"10.1109/ICIP.2013.6738284","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738284","url":null,"abstract":"For creating a three-dimensional (3D) model of a real object using a laser scanner and a camera, texture mapping is an effective technique to enhance the reality. However, in case that the positions of the camera and the laser scanner differ from each other, some textureless regions (holes) may exist on the object surface where the appearance information is missing due to the occlusion or out-of-sight of the camera. In this paper, we propose a new texture completion technique utilizing laser reflectivity for hole-free texture mapping. The laser reflectivity, which denotes the power of a reflected laser light/pulse, is obtained as by-product of the range information at laser scanning. Since the laser reflectivity captures the appearance property of the target as a camera image, it is reasonable that the regions with similar reflectance properties have similar color textures. Based on this idea, texture information in these holes is copied and pasted from the other texture regions according to the similarity and the order determined by the texture and laser reflectivity. To verify the performance of the proposed technique, we carried out texture completion experiments in real scenes.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"242 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116574627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale-space compression and its application using spectral theory","authors":"G. Koutaki, K. Uchimura","doi":"10.1109/ICIP.2013.6738169","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738169","url":null,"abstract":"In this paper, we propose the application of principal component analysis (PCA) to scale-spaces. PCA is a standard method used in computer vision tasks such as recognition of eigenfaces. Because the translation of an input image into scale-space is a continuous operation, it requires the extension of conventional finite matrix based PCA to an infinite number of dimensions. Here, we use spectral theory to resolve this infinite eigenproblem through the use of integration, and we propose an approximate solution based on polynomial equations. In order to clarify its eigensolutions, we apply spectral decomposition to gaussian scale-space. As an application of this proposed method we introduce a method for generating gaussian blur images, demonstrating that the accuracy of such an image can be made very high by using an arbitrary scale calculated through simple linear combination.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115232943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast multi-view based specular removal approach for pill extraction","authors":"Chengjie Wang, S. Kamata, Lizhuang Ma","doi":"10.1109/ICIP.2013.6738850","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738850","url":null,"abstract":"This paper presents a novel approach to remove the specular reflections on the transparent plastic medicine package and automatically extract the randomly distributed pills inside. In this approach, three cameras are employed to take images of the package from different viewpoints. And these three images are used as input image set while the output is a series of small images of a single pill. And these images can be directly applied to the traditional single pill recognition algorithms. The experimental results show the reliability of our approach by measuring correct detection rate (100%), false detection rate (0%) and pill separation accuracy (98.4%). And the proposed method processes a set of three 725×725 sized images at 0.15s averagely on a Core i5-2400 3.1GHz PC.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130244685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable video fusion","authors":"P. Hill, A. Achim, D. Bull","doi":"10.1109/ICIP.2013.6738263","DOIUrl":"https://doi.org/10.1109/ICIP.2013.6738263","url":null,"abstract":"A novel system is introduced that is able to fuse two or more sets of multimodal videos in the transform domain. This is achieved without drift and produces an embedded bitstream that offers fine grain scalability. Previous attempts to fuse in the transform domain have not been possible for video compression systems due to the complications of predictive loops within conventional video encoding. The compression system is based on an optimised spatiotemporal codec using the 3D Discrete Dual-tree Wavelet Transform (DDWT) together with a bit plane encoding method (SPIHT) and a coefficient sparsification process (noise shaping). Together, these methods can efficiently encode a video sequence without the need for motion compensation due to the directional (in space and time) selectivity of the transform. This system offers extremely flexible video fusion in dynamic bandwidth environments where there are variable client receiving capabilities.","PeriodicalId":388385,"journal":{"name":"2013 IEEE International Conference on Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127409405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}