Adriana Kovashka, Olga Russakovsky, Li Fei-Fei, K. Grauman
{"title":"Crowdsourcing in Computer Vision","authors":"Adriana Kovashka, Olga Russakovsky, Li Fei-Fei, K. Grauman","doi":"10.1561/0600000073","DOIUrl":"https://doi.org/10.1561/0600000073","url":null,"abstract":"Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding, for a vast number of visual perception tasks. Crowdsourcing in Computer Vision describes the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. It begins by discussing data collection on both classic vision tasks, such as object recognition, and recent vision tasks, such as visual story-telling. It then summarizes key design decisions for creating effective data collection interfaces and workflows, and presents strategies for intelligently selecting the most important data instances to annotate. It concludes with some thoughts on the future of crowdsourcing in computer vision. Crowdsourcing in Computer Vision provides an overview of how crowdsourcing has been used in computer vision, enabling a computer vision researcher who has previously not collected non-expert data to devise a data collection strategy. It will also be of help to researchers who focus broadly on crowdsourcing to examine how the latter has been applied in computer vision, and to improve the methods that can be employed to ensure the quality and expedience of data collection.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2016-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80961458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-View Stereo: A Tutorial","authors":"Yasutaka Furukawa, Carlos Hernández","doi":"10.1561/0600000052","DOIUrl":"https://doi.org/10.1561/0600000052","url":null,"abstract":"This tutorial presents a hands-on view of the field of multi-view stereo with a focus on practical algorithms. Multi-view stereo algorithms are able to construct highly detailed 3D models from images alone. They take a possibly very large set of images and construct a 3D plausible geometry that explains the images under some reasonable assumptions, the most important being scene rigidity. The tutorial frames the multiview stereo problem as an image/geometry consistency optimization problem. It describes in detail its main two ingredients: robust implementations of photometric consistency measures, and efficient optimization algorithms. It then presents how these main ingredients are used by some of the most successful algorithms, applied into real applications, and deployed as products in the industry. Finally it describes more advanced approaches exploiting domain-specific knowledge such as structural priors, and gives an overview of the remaining challenges and future research directions.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2015-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86769042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raghuraman Gopalan, Ruonan Li, Vishal M. Patel, R. Chellappa
{"title":"Domain Adaptation for Visual Recognition","authors":"Raghuraman Gopalan, Ruonan Li, Vishal M. Patel, R. Chellappa","doi":"10.1561/0600000057","DOIUrl":"https://doi.org/10.1561/0600000057","url":null,"abstract":"Domain adaptation is an active, emerging research area that attemptsto address the changes in data distribution across training and testingdatasets. With the availability of a multitude of image acquisition sensors,variations due to illumination, and viewpoint among others, computervision applications present a very natural test bed for evaluatingdomain adaptation methods. In this monograph, we provide a comprehensiveoverview of domain adaptation solutions for visual recognitionproblems. By starting with the problem description and illustrations,we discuss three adaptation scenarios namely, i unsupervised adaptationwhere the \"source domain\" training data is partially labeledand the \"target domain\" test data is unlabeled, ii semi-supervisedadaptation where the target domain also has partial labels, and iiimulti-domain heterogeneous adaptation which studies the previous twosettings with the source and/or target having more than one domain,and accounts for cases where the features used to represent the datain each domain are different. For all these topics we discuss existingadaptation techniques in the literature, which are motivated by theprinciples of max-margin discriminative learning, manifold learning,sparse coding, as well as low-rank representations. These techniqueshave shown improved performance on a variety of applications suchas object recognition, face recognition, activity analysis, concept classification,and person detection. We then conclude by analyzing thechallenges posed by the realm of \"big visual data\", in terms of thegeneralization ability of adaptation algorithms to unconstrained dataacquisition as well as issues related to their computational tractability,and draw parallels with the efforts from vision community on imagetransformation models, and invariant descriptors so as to facilitate improvedunderstanding of vision problems under uncertainty.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2015-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75037658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fresh Look at Generalized Sampling","authors":"Diego F. Nehab, Hugues Hoppe","doi":"10.1561/0600000053","DOIUrl":"https://doi.org/10.1561/0600000053","url":null,"abstract":"Discretization and reconstruction are fundamental operations in computer graphics, enabling the conversion between sampled and continuous representations. Major advances in signal-processing research have shown that these operations can often be performed more efficiently by decomposing a filter into two parts: a compactly supported continuous-domain function and a digital filter. This strategy of \"generalized sampling\" has appeared in a few graphics papers, but is largely unexplored in our community. This paper broadly summarizes the key aspects of the framework, and delves into specific applications in graphics. Using new notation, we concisely present and extend several key techniques. In addition, we demonstrate benefits for prefiltering in image downscaling and supersample-based rendering, and present an analysis of the associated variance reduction. We conclude with a qualitative and quantitative comparison of traditional and generalized filters.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2014-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88465908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dongwoon Lee, Michael Glueck, Azam Khan, E. Fiume, Kenneth R. Jackson
{"title":"Modeling and Simulation of Skeletal Muscle for Computer Graphics: A Survey","authors":"Dongwoon Lee, Michael Glueck, Azam Khan, E. Fiume, Kenneth R. Jackson","doi":"10.1561/0600000036","DOIUrl":"https://doi.org/10.1561/0600000036","url":null,"abstract":"Muscles provide physiological functions to drive body movement and anatomically characterize body shape, making them a crucial component of modeling animated human figures. Substantial effort has been devoted to developing computational models of muscles for the purpose of increasing realism and accuracy in computer graphics and biomechanics. We survey various approaches to model and simulate muscles both morphologically and functionally. Modeling the realistic morphology of muscle requires that muscle deformation be accurately depicted. To this end, several methodologies are presented, including geometrically-based, physically-based, and data-driven approaches. On the other hand, the simulation of physiological muscle functions aims to identify the biomechanical controls responsible for realistic human motion. Estimating these muscle controls has been pursued through static and dynamic simulations. We review and discuss all these approaches, and conclude with suggestions for future research.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2012-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77481313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Full-Reference Image Quality Metrics: Classification and Evaluation","authors":"Marius Pedersen, J. Hardeberg","doi":"10.1561/0600000037","DOIUrl":"https://doi.org/10.1561/0600000037","url":null,"abstract":"The wide variety of distortions that images are subject to during acquisition, processing, storage, and reproduction can degrade their perceived quality. Since subjective evaluation is time-consuming, expensive, and resource-intensive, objective methods of evaluation have been proposed. One type of these methods, image quality (IQ) metrics, have become very popular and new metrics are proposed continuously. This paper aims to give a survey of one class of metrics, full-reference IQ metrics. First, these IQ metrics were classified into different groups. Second, further IQ metrics from each group were selected and evaluated against six state-of-the-art IQ databases.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2012-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86840961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning","authors":"A. Criminisi, J. Shotton, E. Konukoglu","doi":"10.1561/0600000035","DOIUrl":"https://doi.org/10.1561/0600000035","url":null,"abstract":"This review presents a unified, efficient model of random decision forests which can be applied to a number of machine learning, computer vision, and medical image analysis tasks. \u0000 \u0000Our model extends existing forest-based techniques as it unifies classification, regression, density estimation, manifold learning, semi-supervised learning, and active learning under the same decision forest framework. This gives us the opportunity to write and optimize the core implementation only once, with application to many diverse tasks. \u0000 \u0000The proposed model may be used both in a discriminative or generative way and may be applied to discrete or continuous, labeled or unlabeled data. \u0000 \u0000The main contributions of this review are: (1) Proposing a unified, probabilistic and efficient model for a variety of learning tasks; (2) Demonstrating margin-maximizing properties of classification forests; (3) Discussing probabilistic regression forests in comparison with other nonlinear regression algorithms; (4) Introducing density forests for estimating probability density functions; (5) Proposing an efficient algorithm for sampling from a density forest; (6) Introducing manifold forests for nonlinear dimensionality reduction; (7) Proposing new algorithms for transductive learning and active learning. Finally, we discuss how alternatives such as random ferns and extremely randomized trees stem from our more general forest model. \u0000 \u0000This document is directed at both students who wish to learn the basics of decision forests, as well as researchers interested in the new contributions. It presents both fundamental and novel concepts in a structured way, with many illustrative examples and real-world applications. Thorough comparisons with state-of-the-art algorithms such as support vector machines, boosting and Gaussian processes are presented and relative advantages and disadvantages discussed. The many synthetic examples and existing commercial applications demonstrate the validity of the proposed model and its flexibility.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2012-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75042111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structured Learning and Prediction in Computer Vision","authors":"Sebastian Nowozin, Christoph H. Lampert","doi":"10.1561/0600000033","DOIUrl":"https://doi.org/10.1561/0600000033","url":null,"abstract":"Powerful statistical models that can be learned efficiently from large amounts of data are currently revolutionizing computer vision. These models possess a rich internal structure reflecting task-specific relations and constraints. This monograph introduces the reader to the most popular classes of structured models in computer vision. Our focus is discrete undirected graphical models which we cover in detail together with a description of algorithms for both probabilistic inference and maximum a posteriori inference. We discuss separately recently successful techniques for prediction in general structured models. In the second part of this monograph we describe methods for parameter learning where we distinguish the classic maximum likelihood based methods from the more recent prediction-based parameter learning methods. We highlight developments to enhance current models and discuss kernelized models and latent variable models. To make the monograph more practical and to provide links to further study we provide examples of successful application of many methods in the computer vision literature.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2011-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75559275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Sturm, S. Ramalingam, J. Tardif, Simone Gasparini, J. Barreto
{"title":"Camera Models and Fundamental Concepts Used in Geometric Computer Vision","authors":"P. Sturm, S. Ramalingam, J. Tardif, Simone Gasparini, J. Barreto","doi":"10.1561/0600000023","DOIUrl":"https://doi.org/10.1561/0600000023","url":null,"abstract":"This survey is mainly motivated by the increased availability and use of panoramic image acquisition devices, in computer vision and various of its applications. Different technologies and different computational models thereof exist and algorithms and theoretical studies for geometric computer vision (\"structure-from-motion\") are often re-developed without highlighting common underlying principles. One of the goals of this survey is to give an overview of image acquisition methods used in computer vision and especially, of the vast number of camera models that have been proposed and investigated over the years, where we try to point out similarities between different models. Results on epipolar and multi-view geometry for different camera models are reviewed as well as various calibration and self-calibration approaches, with an emphasis on non-perspective cameras. We finally describe what we consider are fundamental building blocks for geometric computer vision or structure-from-motion: epipolar geometry, pose and motion estimation, 3D scene modeling, and bundle adjustment. The main goal here is to highlight the main principles of these, which are independent of specific camera models.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2011-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82467089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geodesic Methods in Computer Vision and Graphics","authors":"G. Peyré, M. Pechaud, R. Keriven, L. Cohen","doi":"10.1561/0600000029","DOIUrl":"https://doi.org/10.1561/0600000029","url":null,"abstract":"This monograph reviews both the theory and practice of the numerical computation of geodesic distances on Riemannian manifolds. The notion of Riemannian manifold allows one to define a local metric (a symmetric positive tensor field) that encodes the information about the problem one wishes to solve. This takes into account a local isotropic cost (whether some point should be avoided or not) and a local anisotropy (which direction should be preferred). Using this local tensor field, the geodesic distance is used to solve many problems of practical interest such as segmentation using geodesic balls and Voronoi regions, sampling points at regular geodesic distance or meshing a domain with geodesic Delaunay triangles. The shortest paths for this Riemannian distance, the so-called geodesics, are also important because they follow salient curvilinear structures in the domain. We show several applications of the numerical computation of geodesic distances and shortest paths to problems in surface and shape processing, in particular segmentation, sampling, meshing and comparison of shapes. All the figures from this review paper can be reproduced by following the Numerical Tours of Signal Processing. \u0000 \u0000http://www.ceremade.dauphine.fr/~peyre/numerical-tour/ \u0000 \u0000Several textbooks exist that include description of several manifold methods for image processing, shape and surface representation and computer graphics. In particular, the reader should refer to [42, 147, 208, 209, 213, 255] for fascinating applications of these methods to many important problems in vision and graphics. This review paper is intended to give an updated tour of both foundations and trends in the area of geodesic methods in vision and graphics.","PeriodicalId":45662,"journal":{"name":"Foundations and Trends in Computer Graphics and Vision","volume":null,"pages":null},"PeriodicalIF":36.5,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78263581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}