Veda Murthy, Le Hou, Dimitris Samaras, Tahsin M Kurc, Joel H Saltz
{"title":"Center-Focusing Multi-task CNN with Injected Features for Classification of Glioma Nuclear Images.","authors":"Veda Murthy, Le Hou, Dimitris Samaras, Tahsin M Kurc, Joel H Saltz","doi":"10.1109/WACV.2017.98","DOIUrl":"10.1109/WACV.2017.98","url":null,"abstract":"<p><p>Classifying the various shapes and attributes of a glioma cell nucleus is crucial for diagnosis and understanding of the disease. We investigate the automated classification of the nuclear shapes and visual attributes of glioma cells, using Convolutional Neural Networks (CNNs) on pathology images of automatically segmented nuclei. We propose three methods that improve the performance of a previously-developed semi-supervised CNN. First, we propose a method that allows the CNN to focus on the most important part of an image-the image's center containing the nucleus. Second, we inject (concatenate) pre-extracted VGG features into an intermediate layer of our Semi-Supervised CNN so that during training, the CNN can learn a set of additional features. Third, we separate the losses of the two groups of target classes (nuclear shapes and attributes) into a single-label loss and a multi-label loss in order to incorporate prior knowledge of inter-label exclusiveness. On a dataset of 2078 images, the combination of the proposed methods reduces the error rate of attribute and shape classification by 21.54% and 15.07% respectively compared to the existing state-of-the-art method on the same dataset.</p>","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"2017 ","pages":"834-841"},"PeriodicalIF":0.0,"publicationDate":"2017-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5988234/pdf/nihms969223.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36205043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognition of 3D package shapes for single camera metrology","authors":"Ryan Lloyd, Scott McCloskey","doi":"10.1109/WACV.2014.6836113","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836113","url":null,"abstract":"Many applications of 3D object measurement have become commercially viable due to the recent availability of low-cost range cameras such as the Microsoft Kinect. We address the application of measuring an object's dimensions for the purpose of billing in shipping transactions, where high accuracy is required for certification. In particular, we address cases where an object's pose reduces the accuracy with which we can estimate dimensions from a single camera. Because the class of object shapes is limited in the shipping domain, we perform a closed-world recognition in order to determine a shape model which can account for missing parts, and/or to induce the user to reposition the object for higher accuracy. Our experiments demonstrate that the addition of this recognition step significantly improves system accuracy.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"41 1","pages":"99-106"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88394086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sheng Chen, Zhongyuan Feng, Qingkai Lu, Behrooz Mahasseni, Trevor Fiez, Alan Fern, S. Todorovic
{"title":"Play type recognition in real-world football video","authors":"Sheng Chen, Zhongyuan Feng, Qingkai Lu, Behrooz Mahasseni, Trevor Fiez, Alan Fern, S. Todorovic","doi":"10.1109/WACV.2014.6836040","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836040","url":null,"abstract":"This paper presents a vision system for recognizing the sequence of plays in amateur videos of American football games (e.g. offense, defense, kickoff, punt, etc). The system is aimed at reducing user effort in annotating football videos, which are posted on a web service used by over 13,000 high school, college, and professional football teams. Recognizing football plays is particularly challenging in the context of such a web service, due to the huge variations across videos, in terms of camera viewpoint, motion, distance from the field, as well as amateur camerawork quality, and lighting conditions, among other factors. Given a sequence of videos, where each shows a particular play of a football game, we first run noisy play-level detectors on every video. Then, we integrate responses of the play-level detectors with global game-level reasoning which accounts for statistical knowledge about football games. Our empirical results on more than 1450 videos from 10 diverse football games show that our approach is quite effective, and close to being usable in a real-world setting.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"1 1","pages":"652-659"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83172964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model-based anthropometry: Predicting measurements from 3D human scans in multiple poses","authors":"Aggeliki Tsoli, M. Loper, Michael J. Black","doi":"10.1109/WACV.2014.6836115","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836115","url":null,"abstract":"Extracting anthropometric or tailoring measurements from 3D human body scans is important for applications such as virtual try-on, custom clothing, and online sizing. Existing commercial solutions identify anatomical landmarks on high-resolution 3D scans and then compute distances or circumferences on the scan. Landmark detection is sensitive to acquisition noise (e.g. holes) and these methods require subjects to adopt a specific pose. In contrast, we propose a solution we call model-based anthropometry. We fit a deformable 3D body model to scan data in one or more poses; this model-based fitting is robust to scan noise. This brings the scan into registration with a database of registered body scans. Then, we extract features from the registered model (rather than from the scan); these include, limb lengths, circumferences, and statistical features of global shape. Finally, we learn a mapping from these features to measurements using regularized linear regression. We perform an extensive evaluation using the CAESAR dataset and demonstrate that the accuracy of our method outperforms state-of-the-art methods.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"2 1","pages":"83-90"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/WACV.2014.6836115","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72522246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Repeated constrained sparse coding with partial dictionaries for hyperspectral unmixing","authors":"Naveed Akhtar, F. Shafait, A. Mian","doi":"10.1109/WACV.2014.6836001","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836001","url":null,"abstract":"Hyperspectral images obtained from remote sensing platforms have limited spatial resolution. Thus, each spectra measured at a pixel is usually a mixture of many pure spectral signatures (endmembers) corresponding to different materials on the ground. Hyperspectral unmixing aims at separating these mixed spectra into its constituent end-members. We formulate hyperspectral unmixing as a constrained sparse coding (CSC) problem where unmixing is performed with the help of a library of pure spectral signatures under positivity and summation constraints. We propose two different methods that perform CSC repeatedly over the hyperspectral data. However, the first method, Repeated-CSC (RCSC), systematically neglects a few spectral bands of the data each time it performs the sparse coding. Whereas the second method, Repeated Spectral Derivative (RSD), takes the spectral derivative of the data before the sparse coding stage. The spectral derivative is taken such that it is not operated on a few selected bands. Experiments on simulated and real hyperspectral data and comparison with existing state of the art show that the proposed methods achieve significantly higher accuracy. Our results demonstrate the overall robustness of RCSC to noise and better performance of RSD at high signal to noise ratio.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"44 1","pages":"953-960"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79335744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segmentation and matching: Towards a robust object detection system","authors":"Jing Huang, Suya You","doi":"10.1109/WACV.2014.6836082","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836082","url":null,"abstract":"This paper focuses on detecting parts in laser-scanned data of a cluttered industrial scene. To achieve the goal, we propose a robust object detection system based on segmentation and matching, as well as an adaptive segmentation algorithm and an efficient pose extraction algorithm based on correspondence filtering. We also propose an overlapping-based criterion that exploits more information of the original point cloud than the number-of-matching criterion that only considers key-points. Experiments show how each component works and the results demonstrate the performance of our system compared to the state of the art.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"30 1","pages":"325-332"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80935878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introspective semantic segmentation","authors":"Gautam Singh, J. Kosecka","doi":"10.1109/WACV.2014.6836032","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836032","url":null,"abstract":"Traditional approaches for semantic segmentation work in a supervised setting assuming a fixed number of semantic categories and require sufficiently large training sets. The performance of various approaches is often reported in terms of average per pixel class accuracy and global accuracy of the final labeling. When applying the learned models in the practical settings on large amounts of unlabeled data, possibly containing previously unseen categories, it is important to properly quantify their performance by measuring a classifier's introspective capability. We quantify the confidence of the region classifiers in the context of a non-parametric k-nearest neighbor (k-NN) framework for semantic segmentation by using the so called strangeness measure. The proposed measure is evaluated by introducing confidence based image ranking and showing its feasibility on a dataset containing a large number of previously unseen categories.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"12 1","pages":"714-720"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82218125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Mak, Mauricio Hess-Flores, S. Recker, John Douglas Owens, K. Joy
{"title":"GPU-accelerated and efficient multi-view triangulation for scene reconstruction","authors":"J. Mak, Mauricio Hess-Flores, S. Recker, John Douglas Owens, K. Joy","doi":"10.1109/WACV.2014.6836117","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836117","url":null,"abstract":"This paper presents a framework for GPU-accelerated N-view triangulation in multi-view reconstruction that improves processing time and final reprojection error with respect to methods in the literature. The framework uses an algorithm based on optimizing an angular error-based L1 cost function and it is shown how adaptive gradient descent can be applied for convergence. The triangulation algorithm is mapped onto the GPU and two approaches for parallelization are compared: one thread per track and one thread block per track. The better performing approach depends on the number of tracks and the lengths of the tracks in the dataset. Furthermore, the algorithm uses statistical sampling based on confidence levels to successfully reduce the quantity of feature track positions needed to triangulate an entire track. Sampling aids in load balancing for the GPU's SIMD architecture and for exploiting the GPU's memory hierarchy. When compared to a serial implementation, a typical performance increase of 3-4× can be achieved on a 4-core CPU. On a GPU, large track numbers are favorable and an increase of up to 40× can be achieved. Results on real and synthetic data prove that reprojection errors are similar to the best performing current triangulation methods but costing only a fraction of the computation time, allowing for efficient and accurate triangulation of large scenes.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"110 1","pages":"61-68"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88247327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Long Sha, P. Lucey, S. Sridharan, S. Morgan, D. Pease
{"title":"Understanding and analyzing a large collection of archived swimming videos","authors":"Long Sha, P. Lucey, S. Sridharan, S. Morgan, D. Pease","doi":"10.1109/WACV.2014.6836037","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836037","url":null,"abstract":"In elite sports, nearly all performances are captured on video. Despite the massive amounts of video that has been captured in this domain over the last 10-15 years, most of it remains in an “unstructured” or “raw” form, meaning it can only be viewed or manually annotated/tagged with higher-level event labels which is time consuming and subjective. As such, depending on the detail or depth of annotation, the value of the collected repositories of archived data is minimal as it does not lend itself to large-scale analysis and retrieval. One such example is swimming, where each race of a swimmer is captured on a camcorder and in-addition to the split-times (i.e., the time it takes for each lap), stroke rate and stroke-lengths are manually annotated. In this paper, we propose a vision-based system which effectively “digitizes” a large collection of archived swimming races by estimating the location of the swimmer in each frame, as well as detecting the stroke rate. As the videos are captured from moving hand-held cameras which are located at different positions and angles, we show our hierarchical-based approach to tracking the swimmer and their different parts is robust to these issues and allows us to accurately estimate the swimmer location and stroke rates.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"29 1","pages":"674-681"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82766189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real time action recognition using histograms of depth gradients and random decision forests","authors":"H. Rahmani, A. Mahmood, D. Huynh, A. Mian","doi":"10.1109/WACV.2014.6836044","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836044","url":null,"abstract":"We propose an algorithm which combines the discriminative information from depth images as well as from 3D joint positions to achieve high action recognition accuracy. To avoid the suppression of subtle discriminative information and also to handle local occlusions, we compute a vector of many independent local features. Each feature encodes spatiotemporal variations of depth and depth gradients at a specific space-time location in the action volume. Moreover, we encode the dominant skeleton movements by computing a local 3D joint position difference histogram. For each joint, we compute a 3D space-time motion volume which we use as an importance indicator and incorporate in the feature vector for improved action discrimination. To retain only the discriminant features, we train a random decision forest (RDF). The proposed algorithm is evaluated on three standard datasets and compared with nine state-of-the-art algorithms. Experimental results show that, on the average, the proposed algorithm outperform all other algorithms in accuracy and have a processing speed of over 112 frames/second.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"1 1","pages":"626-633"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88786006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}