{"title":"Face recognition using Fisherface algorithm and elastic graph matching","authors":"Hyung-Ji Lee, Wan-Su Lee, Jae-Ho Chung","doi":"10.1109/ICIP.2001.959216","DOIUrl":"https://doi.org/10.1109/ICIP.2001.959216","url":null,"abstract":"This paper proposes a face recognition technique that effectively combines elastic graph matching (EGM) and the Fisherface algorithm. EGM, one of the dynamic link architectures, uses not only face shape but also the gray-level information of the image, and the Fisherface algorithm, a class-specific method, is robust to variations such as lighting direction and facial expression. In the proposed face recognition method, which adopts these two techniques, the linear projection per node of an image graph reduces the dimensionality of the labeled graph vector and provides a feature space that can be used effectively for classification. In comparison with the conventional method, the proposed approach obtains satisfactory results in terms of recognition rate and speed. In particular, we obtained a maximum recognition rate of 99.3% with the leaving-one-out method in experiments on the Yale face database.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133480712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image identification using the segmented Fourier transform and competitive training in the HAVNET neural network","authors":"V. Sujan, M.P. Mulqueen","doi":"10.1109/ICIP.2001.959060","DOIUrl":"https://doi.org/10.1109/ICIP.2001.959060","url":null,"abstract":"An optical modeless image identification algorithm is presented. The system uses the HAusdorff-Voronoi NETwork (HAVNET), an artificial neural network designed for two-dimensional binary pattern recognition. A detailed review of the architecture, the learning equations, and the recognition equations of the HAVNET network is presented. Competitive learning has been implemented in training the network using a nearest-neighbor technique. The image identification system presented in this paper is applied to two tasks: the optical recognition of a set of American Sign Language signals and the identification of grayscale fingerprints. Image preprocessing includes edge enhancement by histogram equalization, application of a Laplacian filter, and thresholding. A segmented Hankel and Fourier transformation in polar coordinates is applied to the binary image, giving a rotationally and translationally invariant image structure. The preprocessed image is then fed to the HAVNET neural network for successful image identification.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133693312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Progressive trellis-coded space-frequency quantization for wavelet image coding","authors":"Pierre Seigneurbieux, Zixiang Xiong","doi":"10.1109/ICIP.2001.958965","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958965","url":null,"abstract":"This paper addresses progressive wavelet image coding within the trellis-coded space-frequency quantization (TCSFQ) framework (Xiong and Wu 1999). A method similar to that in Bilgin et al. (1999) is used to approximately invert TCSFQ when decoding at rates lower than the encoding rate. Our experiments show that the loss incurred for progressive coding is within one dB in PSNR and that the progressive coding performance of TCSFQ is competitive with that of the celebrated SPIHT coder (Said and Pearlman 1996) at all rates.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133431257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Palmprint recognition using crease","authors":"Jun Chen, Changshui Zhang, Gang Rong","doi":"10.1109/ICIP.2001.958094","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958094","url":null,"abstract":"The palmprint has one salient feature that is not salient in the fingerprint: the crease. In a palmprint the creases are large in number and comparatively easy to extract. Creases are also approximately stable over a person's whole life, which qualifies them as features for palmprint recognition. We give creases an accurate definition that is suitable for algorithm implementation. We devised an algorithm to extract all the creases in a palmprint; its success stems mainly from a new direction-computation method, a thorough local analysis, and a robust search algorithm. Based on the extracted creases, we devised a robust palmprint matching algorithm that is rotation and translation invariant. The crease extraction and palmprint matching results show that creases can be extracted successfully and that crease-based palmprint matching is robust and accurate.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"340 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133640009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blind source separation using multinode sparse representation","authors":"P. Kisilev, M. Zibulevsky, Y. Zeevi","doi":"10.1109/ICIP.2001.958086","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958086","url":null,"abstract":"The blind source separation problem is concerned with the extraction of the underlying source signals from a set of their linear mixtures, where the mixing matrix is unknown. It was recently discovered that exploiting the sparsity of sources in their representation according to some signal dictionary dramatically improves the quality of separation. This is especially useful in image processing problems, wherein signals possess strong spatial sparsity. We use multiscale transforms, such as wavelets or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property to select the best (most sparse) subsets of features for further separation. Experiments with 1D signals and images demonstrate significant improvement in separation quality.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132431249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Matching pursuits multiple description coding for wireless video","authors":"X. Tang, A. Zakhor","doi":"10.1109/ICIP.2001.959198","DOIUrl":"https://doi.org/10.1109/ICIP.2001.959198","url":null,"abstract":"Multiple description coding (MDC) is an error-resilient source coding scheme that creates multiple bitstreams of approximately equal importance. We develop a two-description video coding scheme based on the three-loop structure originally studied by Reibman et al. (1999). We adapt the discrete cosine transform structure to the matching pursuits framework and evaluate the performance gain from maximum likelihood (ML) enhancement when both descriptions are available. We find that ML enhancement works best for low-motion sequences. A performance comparison is made between our MDC scheme and single description coding (SDC) schemes over two-state Markov channels and Rayleigh fading channels. We find that MDC outperforms SDC in bursty, slowly varying environments. In the case of Rayleigh fading channels, interleaving helps SDC close the gap and even outperform MDC, depending on the amount of interleaving performed, at the expense of additional delay.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132688798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A low complexity wavelet transform with point-symmetric extension at tile boundaries","authors":"I. Kharitonenko, Xing Zhang","doi":"10.1109/ICIP.2001.958476","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958476","url":null,"abstract":"This paper presents a low-complexity wavelet transform that utilizes point-symmetric extension at image tile boundaries. The proposed solution preserves the perfect reconstruction property of the filter banks and deals efficiently with blocking artifacts when images are lossily compressed. It is shown that the point-symmetric extension at the tile boundaries does not need to be applied explicitly; instead, equivalent boundary filters can be derived. The lifting-based implementation of the filters provides a very simple way of changing filter parameters at the boundaries that suits both hardware and software platforms. A new architecture is proposed to perform the wavelet transform of large images. It minimizes the DSP's internal memory requirements as well as the external buffer bandwidth without producing sharp discontinuities between tiles.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133122277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sensitivity of image-based and texture-based multi-view coding to model accuracy","authors":"M. Magnor, B. Girod","doi":"10.1109/ICIP.2001.958060","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958060","url":null,"abstract":"Multi-view image coding benefits from knowledge of the depicted scene's 3D geometry. To exploit geometry information for compression, two different approaches can be distinguished. In texture-based coding, images are converted to texture maps prior to compression. In image-based predictive coding, geometry is used for disparity compensation and occlusion detection between images. Coding performance of both approaches depends on the accuracy of the available geometry model. Texture-based and image-based coding are compared with regard to the influence of geometry accuracy on coding efficiency. The results are theoretically explained. Experiments with natural as well as synthetic image sets show that texture-based coding is more sensitive to small geometry inaccuracies than image-based coding. For approximate geometry models, image-based coding performs best, while texture-based coding yields superior coding results if scene geometry is exactly known.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127840170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated multimedia processing for topic segmentation and classification","authors":"R. Jasinschi, N. Dimitrova, T. McGee, L. Agnihotri, J. Zimmerman, Dongge Li","doi":"10.1109/ICIP.2001.958127","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958127","url":null,"abstract":"We describe integrated multimedia processing for Video Scout, a system that segments and indexes TV programs according to their audio, visual, and transcript information. Video Scout represents a future direction for personal video recorders. In addition to using electronic program guide metadata and a user profile, Scout allows users to request specific topics within a program. For example, users can request the video clip of the U.S. president speaking from a half-hour news program. Video Scout has three modules: (i) video pre-processing, (ii) segmentation and indexing, and (iii) storage and user interface. Segmentation and indexing, the core of the system, incorporates a Bayesian framework that integrates information from the audio, visual, and transcript (closed captions) domains. This framework uses three layers to process low-, mid-, and high-level multimedia information. The high-level layer generates semantic information about TV program topics. This paper describes the elements of the system and presents results from running Video Scout on real TV programs.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131400493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking multiple individuals for video communication","authors":"A. Avanzi, F. Brémond, M. Thonnat","doi":"10.1109/ICIP.2001.958507","DOIUrl":"https://doi.org/10.1109/ICIP.2001.958507","url":null,"abstract":"We propose a new interpretation platform dedicated to video communication. Basically, a video communication system takes an image flow from a camera and broadcasts it over a computer network. The goal of interpretation is to enable the video communication system to automatically adapt the broadcast image flow using image filtering, blurring, and zooming. To do so, we need to detect and track individuals in office scenes and then understand their behaviour. We focus on a new tracking method based on a 3D model of the scene, on explicit models of individuals, and on the computation of several possible paths for each individual. The main issues of the tracking algorithm are presented. Finally, we show the results of our algorithm on several sequences illustrating office activities in everyday situations.","PeriodicalId":291827,"journal":{"name":"Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205)","volume":"24 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131415241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}