{"title":"Quality versus intelligibility: Studying human preferences for American Sign Language video","authors":"Frank M. Ciaramello, S. Hemami","doi":"10.1117/12.876733","DOIUrl":"https://doi.org/10.1117/12.876733","url":null,"abstract":"Real-time videoconferencing using cellular devices provides natural communication to the Deaf community. For this application, compressed American Sign Language (ASL) video must be evaluated in terms of the intelligibility of the conversation and not in terms of the overall aesthetic quality of the video. This work conducts an experiment to determine the subjective preferences of ASL users in terms of the trade-off between intelligibility and quality when varying the proportion of the bitrate allocated explicitly to the regions of the video containing the signer. A rate-distortion optimization technique, which jointly optimizes for quality and intelligibility according to a user-specified parameter, generates test video pairs for the subjective experiment. Preliminary results suggest that at high bitrates, users prefer videos in which the non-signer regions of the video are encoded at some nominal rate. As the total encoding bitrate decreases, users prefer video in which a greater proportion of the rate is allocated to the signer.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"23 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126179146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Determining camera pose using free-form lines","authors":"S. Nagarajan, T. Schenk, B. Csathó","doi":"10.1109/WNYIPW.2010.5649747","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649747","url":null,"abstract":"Determining the pose, or exterior orientation, of a camera is an important step in surface reconstruction and in subsequent steps such as perceptual organization and object recognition in photogrammetry and computer vision. Pose refers to the position and attitude of a perspective camera at the time of exposure. It is generally determined using a set of points in object space and image space. Using free-form lines as a common feature between image and object spaces has long been a challenge, mainly because no mathematical model has been available that relates free-form lines in object space and image space to the pose of the camera. This paper introduces a novel mathematical model to determine camera pose using free-form lines. Everything recorded by a sensor can be considered a discretization of the physical surface [1]. Hence, it is unlikely that the same points can be recorded or identified in both image and object space, and it is usually easier to find corresponding lines. The method opens a way to determine the pose of images using historical Geographical Information System (GIS) or other object-space data.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121305201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A HVS-driven image segmentation framework using a local segmentation performance measure","authors":"Renbin Peng, P. Varshney","doi":"10.1109/WNYIPW.2010.5649767","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649767","url":null,"abstract":"This paper presents a novel framework for image segmentation. In this framework, image segmentation is considered to be a detection problem, and a “soft” segmentation objective function, in terms of the detection performance measure in local regions, is employed to guide the segmentation procedure. The human visual system information is also incorporated into the segmentation procedure to improve the efficiency of the framework by introducing a contrast sensitivity function-weighting operation in the wavelet domain. Encouraging experimental results are obtained when the algorithm is applied to real-world image data.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131637993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of a web-based blind test to score and rank hyperspectral classification algorithms","authors":"K. King, J. Kerekes","doi":"10.1109/WNYIPW.2010.5649748","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649748","url":null,"abstract":"Remotely sensed hyperspectral imagery plays an important role in land cover classification by supplying the user with additional spectral data as compared to high-resolution color imagery. The web application described in this paper enables users to test their classification algorithms without the risk of bias by withholding the majority of the true classification data and only providing a small section of the truth data to be used for training user algorithms. After downloading the dataset, users run their classification algorithms and upload their results back to the web application. The blind test site automatically scores and ranks the uploaded result. The Classification Blind Test web application can be found at: http://dirsapps.cis.rit.edu/classtest/.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116621825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tree-metrics graph cuts for brain MRI segmentation with tree cutting","authors":"R. Fang, Yu-hsin Joyce Chen, R. Zabih, Tsuhan Chen","doi":"10.1109/WNYIPW.2010.5649772","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649772","url":null,"abstract":"We tackle the problem of brain MRI image segmentation using the tree-metric graph cuts (TM) algorithm, a novel image segmentation algorithm, and introduce a “tree-cutting” method to interpret the labeling returned by the TM algorithm as tissue classification for the input brain MRI image. The approach has three steps: 1) pre-processing, which generates a tree of labels as input to the TM algorithm; 2) a sweep of the TM algorithm, which returns a globally optimal labeling with respect to the tree of labels; 3) post-processing, which involves running the “tree-cutting” method to generate a mapping from labels to tissue classes (GM, WM, CSF), producing a meaningful brain MRI segmentation. The TM algorithm produces a globally optimal labeling on tree metrics in one sweep, unlike conventional methods such as EMS and EM-style geo-cuts, which iterate the expectation maximization algorithm to find hidden patterns and produce only locally optimal labelings. When used with the “tree-cutting” method, the TM algorithm produces brain MRI segmentations that are as good as the Unified Segmentation algorithm used by SPM8, using a much weaker prior. Comparison with current approaches shows that our method is faster and that our overall segmentation accuracy is better.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133081408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-centric approaches to image understanding and retrieval","authors":"Rui Li, Preethi Vaidyanathan, Sai Mulpuru, J. Pelz, P. Shi, C. Calvelli, Anne R. Haake","doi":"10.1109/WNYIPW.2010.5649743","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649743","url":null,"abstract":"The amount of digital medical image data is increasing rapidly in terms of both quantity and heterogeneity. There exists a great need to format medical image archives so as to facilitate diagnostics and preventive medicine. To achieve this, in the past few decades great efforts have been made to investigate methods of applying content-based image retrieval (CBIR) techniques to retrieve images. However, several critical challenges remain. Recently, CBIR research has become intertwined with the fundamental problem of image understanding, and it is recognized that computing solutions that bridge the “semantic gap” must capture the higher-level domain knowledge of medical end users. We are investigating the incorporation of state-of-the-art visual categorization techniques into conventional CBIR approaches. Visual attention deployment strategies of medical experts serve as an objective measure to help us understand the perceptual and conceptual processes involved in identifying key visual features and selecting diagnostic regions of the images. Understanding these processes will inform and direct feature selection approaches on medical images, such as the dermatological images used in our study. We also explore systematic and effective methods for integrating image data and semantic descriptions, with the long-term goal of building efficient human-centered multi-modal interactive CBIR systems.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127747028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting with stereo features for building facade detection on mobile platforms","authors":"J. Delmerico, Jason J. Corso, P. David","doi":"10.1109/WNYIPW.2010.5649753","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649753","url":null,"abstract":"Boosting has been widely used for discriminative modeling of objects in images. Conventionally, pixel- and patch-based features have been used, but recently, features defined on multilevel aggregate regions were incorporated into the boosting framework, and demonstrated significant improvement in object labeling tasks. In this paper, we further extend the boosting on multilevel aggregates method to incorporate features based on stereo images. Our underlying application is building facade detection on mobile stereo vision platforms. Example features we propose exploit the algebraic constraints of the planar building facades and depth gradient statistics. We've implemented the features and tested the framework on real stereo data.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116941410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Print engine color management using customer image content","authors":"Michael W. Elliot, J. Cockburn","doi":"10.1109/WNYIPW.2010.5649758","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649758","url":null,"abstract":"The production of quality color prints requires that color accuracy and reproducibility be maintained within very tight tolerances when transferred to different media. Variations in the printing process commonly produce color shifts that result in poor color reproduction. The primary function of a color management system is maintaining color quality and consistency. Currently these systems are tuned in the factory by printing a large set of test color patches, measuring them, and making the necessary adjustments. This time-consuming procedure must be repeated as needed once the printer leaves the factory. In this work, a color management system is proposed that compensates for print color shifts in real time using feedback from an in-line full-width sensor. Instead of printing test patches, this novel approach to color management utilizes the output pixels already rendered in production pages for continuous printer characterization. The printed pages are scanned in-line and the results are used to update the process by which colorimetric image content is translated into engine-specific color separations (e.g., CIELAB->CMYK). The proposed system provides a means to perform automatic printer characterization by simply printing a set of images that cover the gamut of the printer. Moreover, all of the color conversion features currently utilized in production systems (such as Gray Component Replacement, Gamut Mapping, and Color Smoothing) can be achieved with the proposed system.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125127626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced situational awareness and obstacle detection using a monocular camera","authors":"Abhijit Bhoite, N. Beke, Sashank Nanduri, Timothy Duffy, M. Torres","doi":"10.1109/WNYIPW.2010.5649763","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649763","url":null,"abstract":"This paper presents a modular approach for a high resolution monocular camera based system to detect, track, and display potential obstacles and navigational threats to soldiers and operators for manned and unmanned ground vehicles. This approach enhances situational awareness by integrating obstacle detection and motion tracking algorithms with virtual pan-zoom-tilt (VPZT) techniques, enabling soldiers to interactively view an arbitrary region of interest (ROI) at the highest captured resolution. Depth determination from a single imager is challenging and an approach for depth information, along with size and motion information is developed to assign a threat level to each obstacle.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128944304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical image clustering for analyzing eye tracking videos","authors":"Thomas B. Kinsman, P. Bajorski, J. Pelz","doi":"10.1109/WNYIPW.2010.5649742","DOIUrl":"https://doi.org/10.1109/WNYIPW.2010.5649742","url":null,"abstract":"The classification of a large number of images is a familiar problem to the image processing community. It occurs in consumer photography, bioinformatics, biomedical imaging, surveillance, and in the field of mobile eye-tracking studies. During eye-tracking studies, what a person looks at is recorded, and for each frame what the person looked at must then be analyzed and classified. In many cases the data analysis time restricts the scope of the studies. This paper describes the initial use of hierarchical clustering of these images to minimize the time required during analysis. Pre-clustering the images allows the user to classify a large number of images simultaneously. The success of this method is dependent on meeting requirements for human-computer-interactions, which are also discussed.","PeriodicalId":210139,"journal":{"name":"2010 Western New York Image Processing Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130762643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}