2011 International Conference on Computer Vision: Latest Publications

Learning to cluster using high order graphical models with latent variables
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126227
N. Komodakis
{"title":"Learning to cluster using high order graphical models with latent variables","authors":"N. Komodakis","doi":"10.1109/ICCV.2011.6126227","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126227","url":null,"abstract":"This paper proposes a very general max-margin learning framework for distance-based clustering. To this end, it formulates clustering as a high order energy minimization problem with latent variables, and applies a dual decomposition approach for training this model. The resulting framework allows learning a very broad class of distance functions, permits an automatic determination of the number of clusters during testing, and is also very efficient. As an additional contribution, we show how our method can be generalized to handle the training of a very broad class of important models in computer vision: arbitrary high-order latent CRFs. Experimental results verify its effectiveness.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80649971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
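The training in this framework rests on dual decomposition. As a rough, generic illustration of the mechanics only (it is not the paper's clustering or learning formulation), the Python sketch below minimizes a sum of two toy energies over discrete labels by giving each term its own copy of the variables and coordinating the copies with projected subgradient updates on the Lagrange multipliers.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 6, 4                      # n discrete variables, K labels each
E1 = rng.normal(size=(n, K))     # unary costs handled by slave problem 1
E2 = rng.normal(size=(n, K))     # unary costs handled by slave problem 2
lam = np.zeros((n, K))           # multipliers tying the two variable copies

def onehot(labels, K):
    out = np.zeros((len(labels), K))
    out[np.arange(len(labels)), labels] = 1.0
    return out

for t in range(200):
    # Each slave minimizes its own energy plus/minus the multiplier term.
    x1 = np.argmin(E1 + lam, axis=1)
    x2 = np.argmin(E2 - lam, axis=1)
    if np.array_equal(x1, x2):   # the copies agree: consistent labeling found
        break
    # Projected-subgradient step on the dual pushes the copies toward agreement.
    lam += (1.0 / (1 + t)) * (onehot(x1, K) - onehot(x2, K))

print("agree:", np.array_equal(x1, x2))
print("labels:", x1, "energy:", (E1[np.arange(n), x1] + E2[np.arange(n), x1]).sum())
```

For this separable toy the slave problems are trivial; the value of the scheme is that the same multiplier coordination still applies when each slave is a structured, high-order subproblem solved by its own specialized algorithm.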
Building a better probabilistic model of images by factorization
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126473
B. J. Culpepper, Jascha Narain Sohl-Dickstein, B. Olshausen
{"title":"Building a better probabilistic model of images by factorization","authors":"B. J. Culpepper, Jascha Narain Sohl-Dickstein, B. Olshausen","doi":"10.1109/ICCV.2011.6126473","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126473","url":null,"abstract":"We describe a directed bilinear model that learns higher-order groupings among features of natural images. The model represents images in terms of two sets of latent variables: one set of variables represents which feature groups are active, while the other specifies the relative activity within groups. Such a factorized representation is beneficial because it is stable in response to small variations in the placement of features while still preserving information about relative spatial relationships. When trained on MNIST digits, the resulting representation provides state of the art performance in classification using a simple classifier. When trained on natural images, the model learns to group features according to proximity in position, orientation, and scale. The model achieves high log-likelihood (−94 nats), surpassing the current state of the art for natural images achievable with an mcRBM model.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78790246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
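For intuition about the kind of factorized representation described here, the toy sketch below reconstructs an image as a bilinear function of a group-gating vector and within-group coefficients over a fixed basis. The basis, sizes, and random latent values are illustrative assumptions; inference, learning, and the directed structure of the authors' actual model are not shown.

```python
import numpy as np

rng = np.random.default_rng(1)
G, M, P = 5, 8, 16 * 16          # feature groups, components per group, pixels
B = rng.normal(size=(G, M, P))   # basis functions (random stand-in for learned ones)

c = (rng.random(G) < 0.4).astype(float)   # which feature groups are active (gates)
d = rng.normal(size=(G, M))               # relative activity within each group

# Bilinear reconstruction: x = sum_g c[g] * sum_m d[g, m] * B[g, m, :]
x = np.einsum('g,gm,gmp->p', c, d, B)
print(x.reshape(16, 16).shape)            # a 16x16 image patch
```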
Image based detection of geometric changes in urban environments
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126515
Aparna Taneja, Luca Ballan, M. Pollefeys
{"title":"Image based detection of geometric changes in urban environments","authors":"Aparna Taneja, Luca Ballan, M. Pollefeys","doi":"10.1109/ICCV.2011.6126515","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126515","url":null,"abstract":"In this paper, we propose an efficient technique to detect changes in the geometry of an urban environment using some images observing its current state. The proposed method can be used to significantly optimize the process of updating the 3D model of a city changing over time, by restricting this process to only those areas where changes are detected. With this application in mind, we designed our algorithm to specifically detect only structural changes in the environment, ignoring any changes in its appearance, and ignoring also all the changes which are not relevant for update purposes, such as cars, people etc. As a by-product, the algorithm also provides a coarse geometry of the detected changes. The performance of the proposed method was tested on four different kinds of urban environments and compared with two alternative techniques.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79454954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 89
Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126238
Ryan Farrell, Om Oza, Ning Zhang, Vlad I. Morariu, Trevor Darrell, L. Davis
{"title":"Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance","authors":"Ryan Farrell, Om Oza, Ning Zhang, Vlad I. Morariu, Trevor Darrell, L. Davis","doi":"10.1109/ICCV.2011.6126238","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126238","url":null,"abstract":"Subordinate-level categorization typically rests on establishing salient distinctions between part-level characteristics of objects, in contrast to basic-level categorization, where the presence or absence of parts is determinative. We develop an approach for subordinate categorization in vision, focusing on an avian domain due to the fine-grained structure of the category taxonomy for this domain. We explore a pose-normalized appearance model based on a volumetric poselet scheme. The variation in shape and appearance properties of these parts across a taxonomy provides the cues needed for subordinate categorization. Training pose detectors requires a relatively large amount of training data per category when done from scratch; using a subordinate-level approach, we exploit a pose classifier trained at the basic-level, and extract part appearance and shape information to build subordinate-level models. Our model associates the underlying image pattern parameters used for detection with corresponding volumetric part location, scale and orientation parameters. These parameters implicitly define a mapping from the image pixels into a pose-normalized appearance space, removing view and pose dependencies, facilitating fine-grained categorization from relatively few training examples.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78492824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 217
Stereo reconstruction using high order likelihood
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126371
H. Jung, Kyoung Mu Lee, Sang Uk Lee
{"title":"Stereo reconstruction using high order likelihood","authors":"H. Jung, Kyoung Mu Lee, Sang Uk Lee","doi":"10.1109/ICCV.2011.6126371","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126371","url":null,"abstract":"Under the popular Bayesian approach, a stereo problem can be formulated by defining likelihood and prior. Likelihoods are often associated with unary terms and priors are defined by pair-wise or higher order cliques in Markov random field (MRF). In this paper, we propose to use high order likelihood model in stereo. Numerous conventional patch based matching methods such as normalized cross correlation, Laplacian of Gaussian, or census filters are designed under the naive assumption that all the pixels of a patch have the same disparities. However, patch-wise cost can be formulated as higher order cliques for MRF so that the matching cost is a function of image patch's disparities. A patch obtained from the projected image by a disparity map should provide a better match without the blurring effect around disparity discontinuities. Among patch-wise high order matching costs, the census filter approach can be easily reduced to pair-wise cliques. The experimental results on census filter-based high order likelihood demonstrate the advantages of high order likelihood over independent identically distributed unary model.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78876754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
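The abstract contrasts patch-wise costs such as the census filter with independent per-pixel (unary) costs. For reference, a minimal census transform with a Hamming-distance matching cost looks as follows; these are standard building blocks, not the paper's high-order likelihood model, and the synthetic 3-pixel disparity at the end is only a demo assumption.

```python
import numpy as np

def census(img, r=1):
    """Census transform with a (2r+1)x(2r+1) window: one bit per neighbor,
    set to 1 where the neighbor is darker than the center pixel."""
    codes = np.zeros(img.shape, dtype=np.uint32)
    bit = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
            codes |= (neighbor < img).astype(np.uint32) << bit
            bit += 1
    return codes

def census_cost(left, right, d):
    """Per-pixel Hamming distance between the left census codes and the codes
    of the right image shifted by disparity d."""
    xor = census(left) ^ np.roll(census(right), d, axis=1)
    bytes_ = xor.view(np.uint8).reshape(*xor.shape, 4)
    return np.unpackbits(bytes_, axis=-1).sum(axis=-1)

left = np.random.default_rng(2).random((32, 32))
right = np.roll(left, -3, axis=1)                # synthetic scene with disparity 3
print(census_cost(left, right, 3).mean())        # ~0 at the true disparity
print(census_cost(left, right, 0).mean())        # clearly larger at a wrong one
```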
N-best maximal decoders for part models
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126552
Dennis Park, Deva Ramanan
{"title":"N-best maximal decoders for part models","authors":"Dennis Park, Deva Ramanan","doi":"10.1109/ICCV.2011.6126552","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126552","url":null,"abstract":"We describe a method for generating N-best configurations from part-based models, ensuring that they do not overlap according to some user-provided definition of overlap. We extend previous N-best algorithms from the speech community to incorporate non-maximal suppression cues, such that pixel-shifted copies of a single configuration are not returned. We use approximate algorithms that perform nearly identical to their exact counterparts, but are orders of magnitude faster. Our approach outperforms standard methods for generating multiple object configurations in an image. We use our method to generate multiple pose hypotheses for the problem of human pose estimation from video sequences. We present quantitative results that demonstrate that our framework significantly improves the accuracy of a state-of-the-art pose estimation algorithm.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76691504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 126
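The output specification here, the N highest-scoring configurations that are mutually non-overlapping under a user-provided overlap measure, can be stated with a brute-force sketch over an explicit candidate pool, shown below. The paper's contribution is computing this efficiently inside the part-model decoder rather than by enumeration; the interval toy data and IoU overlap are assumptions for the demo.

```python
from typing import Callable, List, Sequence, Tuple

def n_best(candidates: Sequence[Tuple[float, object]],
           overlap: Callable[[object, object], float],
           n: int, max_overlap: float = 0.5) -> List[Tuple[float, object]]:
    """Greedily keep the top-scoring configurations whose overlap with every
    previously kept configuration stays at or below max_overlap."""
    chosen: List[Tuple[float, object]] = []
    for score, cfg in sorted(candidates, key=lambda p: p[0], reverse=True):
        if all(overlap(cfg, kept) <= max_overlap for _, kept in chosen):
            chosen.append((score, cfg))
            if len(chosen) == n:
                break
    return chosen

def interval_iou(a, b):
    """Overlap of two 1-D 'configurations' given as (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

cands = [(0.9, (0, 10)), (0.85, (1, 11)), (0.7, (20, 30)), (0.6, (25, 35))]
print(n_best(cands, interval_iou, n=3))
# keeps (0, 10), (20, 30), (25, 35); drops (1, 11) as a pixel-shifted near-duplicate
```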
Dynamic Manifold Warping for view invariant action recognition
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126290
Dian Gong, G. Medioni
{"title":"Dynamic Manifold Warping for view invariant action recognition","authors":"Dian Gong, G. Medioni","doi":"10.1109/ICCV.2011.6126290","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126290","url":null,"abstract":"We address the problem of learning view-invariant 3D models of human motion from motion capture data, in order to recognize human actions from a monocular video sequence with arbitrary viewpoint. We propose a Spatio-Temporal Manifold (STM) model to analyze non-linear multivariate time series with latent spatial structure and apply it to recognize actions in the joint-trajectories space. Based on STM, a novel alignment algorithm Dynamic Manifold Warping (DMW) and a robust motion similarity metric are proposed for human action sequences, both in 2D and 3D. DMW extends previous works on spatio-temporal alignment by incorporating manifold learning. We evaluate and compare the approach to state-of-the-art methods on motion capture data and realistic videos. Experimental results demonstrate the effectiveness of our approach, which yields visually appealing alignment results, produces higher action recognition accuracy, and can recognize actions from arbitrary views with partial occlusion.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77376154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 91
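DMW builds on temporal alignment of multivariate sequences. For orientation, plain dynamic time warping (DTW) between two sequences is sketched below; DMW replaces the raw Euclidean frame distance used here with distances in a learned latent/manifold space and adds the spatial components described in the abstract, none of which this sketch attempts.

```python
import numpy as np

def dtw(X, Y):
    """X: (n, d) and Y: (m, d) sequences. Returns the DTW alignment cost."""
    n, m = len(X), len(Y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(X[i - 1] - Y[j - 1])   # frame-to-frame distance
            D[i, j] = cost + min(D[i - 1, j],      # stall on a frame of Y
                                 D[i, j - 1],      # stall on a frame of X
                                 D[i - 1, j - 1])  # advance both sequences
    return D[n, m]

t = np.linspace(0, 2 * np.pi, 50)
a = np.stack([np.sin(t), np.cos(t)], axis=1)               # reference motion
b = np.stack([np.sin(t * 1.3), np.cos(t * 1.3)], axis=1)   # time-warped version
print(dtw(a, b), dtw(a, a))   # warped pair costs more than the self-alignment (0)
```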
Automated corpus callosum extraction via Laplace-Beltrami nodal parcellation and intrinsic geodesic curvature flows on surfaces
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126476
Rongjie Lai, Yonggang Shi, N. Sicotte, A. Toga
{"title":"Automated corpus callosum extraction via Laplace-Beltrami nodal parcellation and intrinsic geodesic curvature flows on surfaces","authors":"Rongjie Lai, Yonggang Shi, N. Sicotte, A. Toga","doi":"10.1109/ICCV.2011.6126476","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126476","url":null,"abstract":"Corpus callosum (CC) is an important structure in human brain anatomy. In this work, we propose a fully automated and robust approach to extract corpus callosum from T1-weighted structural MR images. The novelty of our method is composed of two key steps. In the first step, we find an initial guess for the curve representation of CC by using the zero level set of the first nontrivial Laplace-Beltrami (LB) eigenfunction on the white matter surface. In the second step, the initial curve is deformed toward the final solution with a geodesic curvature flow on the white matter surface. For numerical solution of the geodesic curvature flow on surfaces, we represent the contour implicitly on a triangular mesh and develop efficient numerical schemes based on finite element method. Because our method depends only on the intrinsic geometry of the white matter surface, it is robust to orientation differences of the brain across population. In our experiments, we validate the proposed algorithm on 32 brains from a clinical study of multiple sclerosis disease and demonstrate that the accuracy of our results.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76641952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
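The first step uses the zero level set of the first nontrivial Laplace-Beltrami eigenfunction on the white-matter surface. The discrete analogue on a toy graph is the second eigenvector (Fiedler vector) of a graph Laplacian, whose sign change plays the role of that zero level set. The sketch below uses a uniform grid-graph Laplacian purely for illustration; the paper works with the cotangent-weighted operator on an actual triangle mesh.

```python
import numpy as np

# Toy "surface": a 10x40 grid graph standing in for an elongated mesh.
h, w = 10, 40
idx = np.arange(h * w).reshape(h, w)
edges = [(idx[i, j], idx[i, j + 1]) for i in range(h) for j in range(w - 1)]
edges += [(idx[i, j], idx[i + 1, j]) for i in range(h - 1) for j in range(w)]

# Unnormalized graph Laplacian L = D - A.
L = np.zeros((h * w, h * w))
for a, b in edges:
    L[a, a] += 1.0
    L[b, b] += 1.0
    L[a, b] -= 1.0
    L[b, a] -= 1.0

# Eigenvectors in ascending eigenvalue order: the first is constant, the
# second is the first nontrivial eigenfunction (the Fiedler vector).
_, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1].reshape(h, w)

# Its zero level set bisects the long axis of the grid into two halves.
print((fiedler > 0).sum(), (fiedler <= 0).sum())
```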
Adaptive deconvolutional networks for mid and high level feature learning
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126474
Matthew D. Zeiler, Graham W. Taylor, R. Fergus
{"title":"Adaptive deconvolutional networks for mid and high level feature learning","authors":"Matthew D. Zeiler, Graham W. Taylor, R. Fergus","doi":"10.1109/ICCV.2011.6126474","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126474","url":null,"abstract":"We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts and complete objects. To build our model we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This makes it possible to learn multiple layers of representation and we show models with 4 layers, trained on images from the Caltech-101 and 256 datasets. When combined with a standard classifier, features extracted from these models outperform SIFT, as well as representations from other feature learning methods.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77059727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1182
Exploiting the Manhattan-world assumption for extrinsic self-calibration of multi-modal sensor networks
2011 International Conference on Computer Vision Pub Date: 2011-11-06 DOI: 10.1109/ICCV.2011.6126337
Marcel Brückner, Joachim Denzler
{"title":"Exploiting the Manhattan-world assumption for extrinsic self-calibration of multi-modal sensor networks","authors":"Marcel Brückner, Joachim Denzler","doi":"10.1109/ICCV.2011.6126337","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126337","url":null,"abstract":"Many new applications are enabled by combining a multi-camera system with a Time-of-Flight (ToF) camera, which is able to simultaneously record intensity and depth images. Classical approaches for self-calibration of a multi-camera system fail to calibrate such a system due to the very different image modalities. In addition, the typical environments of multi-camera systems are man-made and consist primary of only low textured objects. However, at the same time they satisfy the Manhattan-world assumption. We formulate the multi-modal sensor network calibration as a Maximum a Posteriori (MAP) problem and solve it by minimizing the corresponding energy function. First we estimate two separate 3D reconstructions of the environment: one using the pan-tilt unit mounted ToF camera and one using the multi-camera system. We exploit the Manhattan-world assumption and estimate multiple initial calibration hypotheses by registering the three dominant orientations of planes. These hypotheses are used as prior knowledge of a subsequent MAP estimation aiming to align edges that are parallel to these dominant directions. To our knowledge, this is the first self-calibration approach that is able to calibrate a ToF camera with a multi-camera system. Quantitative experiments on real data demonstrate the high accuracy of our approach.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73749958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
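One ingredient of the initialization is registering the three dominant Manhattan plane orientations seen by the two sensor systems. The closed-form rotation between two corresponding sets of unit directions is the orthogonal Procrustes (Kabsch) solution sketched below; the paper additionally enumerates axis permutations and flips to form multiple hypotheses and refines the full calibration with a MAP estimate, which this sketch does not cover.

```python
import numpy as np

def align_directions(src, dst):
    """src, dst: (k, 3) arrays of unit vectors with dst[i] ~ R @ src[i].
    Returns the proper rotation R (orthogonal Procrustes / Kabsch)."""
    H = dst.T @ src                       # cross-covariance of the direction sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))    # enforce det(R) = +1 (no reflection)
    return U @ np.diag([1.0, 1.0, d]) @ Vt

# Toy check: recover a known rotation from the three Manhattan axes.
rng = np.random.default_rng(3)
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1                    # make it a proper rotation
axes = np.eye(3)                          # dominant directions in sensor frame 1
observed = axes @ R_true.T                # the same directions seen from frame 2
print(np.allclose(align_directions(axes, observed), R_true))   # True
```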