2015 IEEE International Conference on Computer Vision (ICCV): Latest Papers, Page 3

Simpler Non-Parametric Methods Provide as Good or Better Results to Multiple-Instance Learning

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.299

Ragav Venkatesan, P. S. Chandakkar, Baoxin Li

Abstract: Multiple-instance learning (MIL) is a unique learning problem in which training data labels are available only for collections of objects (called bags) instead of individual objects (called instances). A plethora of approaches have been developed to solve this problem in the past years. Popular methods include the diverse density, MILIS and DD-SVM. Although widely used, these methods, particularly those in computer vision, have attempted fairly sophisticated solutions to certain unique and particular configurations of the MIL space. In this paper, we analyze the MIL feature space using modified versions of traditional non-parametric techniques like the Parzen window and k-nearest-neighbour, and develop a learning approach employing distances to k-nearest neighbours of a point in the feature space. We show that these methods work as well as, if not better than, most recently published methods on benchmark datasets. We compare and contrast our analysis with the well-established diverse-density approach and its variants in recent literature, using benchmark datasets including the Musk, Andrews' and Corel datasets, along with a diabetic retinopathy pathology diagnosis dataset. Experimental results demonstrate that, while enjoying an intuitive interpretation and supporting fast learning, these methods have the potential of delivering improved performance even for complex data arising from real-world applications.

Pages: 2605-2613

Citations: 13
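
The abstract above mentions a learning approach built on distances to the k nearest neighbours of a point in the feature space. The sketch below is only a rough illustration of that general idea, not the authors' method. It assumes the standard MIL convention that negative bags contain only negative instances, and scores a bag by how far its most unusual instance lies from the instances of negative training bags. The function names, the toy data, and the choice of k are all hypothetical.

```python
import math

def knn_mean_distance(point, reference_points, k):
    """Mean Euclidean distance from `point` to its k nearest reference points."""
    nearest = sorted(math.dist(point, ref) for ref in reference_points)[:k]
    return sum(nearest) / len(nearest)

def bag_score(bag, negative_instances, k=2):
    """Score a bag by its instance that lies farthest from known negatives.

    Assumption (not from the abstract): a bag is positive if at least one
    of its instances is positive, so the maximum over instances is used.
    """
    return max(knn_mean_distance(x, negative_instances, k) for x in bag)

# Toy 2-D data: instances from negative bags cluster near the origin.
negative_bags = [
    [(0.1, 0.3), (0.4, 0.2)],
    [(0.3, 0.0), (0.0, 0.5)],
]
negative_instances = [x for bag in negative_bags for x in bag]

bag_a = [(0.1, 0.1), (4.8, 5.2)]  # contains one instance far from the negatives
bag_b = [(0.2, 0.2), (0.3, 0.4)]  # every instance resembles the negatives

score_a = bag_score(bag_a, negative_instances)
score_b = bag_score(bag_b, negative_instances)
print(score_a > score_b)
```

A higher score suggests that a bag is more likely positive. Turning the score into a label would require a threshold chosen on held-out bags, which this sketch leaves out.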

A Versatile Scene Model with Differentiable Visibility Applied to Generative Pose Estimation

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.94

Helge Rhodin, Nadia Robertini, Christian Richardt, H. Seidel, C. Theobalt

Abstract: Generative reconstruction methods compute the 3D configuration (such as pose and/or geometry) of a shape by optimizing the overlap of the projected 3D shape model with images. Proper handling of occlusions is a big challenge, since the visibility function that indicates if a surface point is seen from a camera can often not be formulated in closed form, and is in general discrete and non-differentiable at occlusion boundaries. We present a new scene representation that enables an analytically differentiable closed-form formulation of surface visibility. In contrast to previous methods, this yields smooth, analytically differentiable, and efficient-to-optimize pose similarity energies with rigorous occlusion handling, fewer local minima, and experimentally verified improved convergence of numerical optimization. The underlying idea is a new image formation model that represents opaque objects by a translucent medium with a smooth Gaussian density distribution, which turns visibility into a smooth phenomenon. We demonstrate the advantages of our versatile scene model in several generative pose estimation problems, namely marker-less multi-object pose estimation, marker-less human motion capture with few cameras, and image-based 3D geometry estimation.

Pages: 765-773

Citations: 80
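
The abstract above describes replacing opaque objects with a translucent medium of smooth Gaussian density so that visibility becomes smooth and has a closed form. The sketch below illustrates one way such a closed form can arise, under assumptions that are not taken from the paper: each object is an isotropic Gaussian blob, and visibility at distance s along a ray is the transmittance exp(-optical depth). The integral of a Gaussian along a straight ray then reduces to the error function. All function names and parameter values are hypothetical.

```python
import math

def optical_depth(origin, direction, s, center, sigma, peak):
    """Integral of an isotropic Gaussian density along a ray from t=0 to t=s.

    `direction` is assumed to be a unit vector. The density is
    peak * exp(-|x - center|^2 / (2 sigma^2)), an illustrative choice.
    """
    to_center = [c - o for c, o in zip(center, origin)]
    # Ray parameter of the point closest to the blob centre.
    t0 = sum(d * v for d, v in zip(direction, to_center))
    # Squared perpendicular distance from the centre to the ray.
    r2 = max(0.0, sum(v * v for v in to_center) - t0 * t0)
    scale = sigma * math.sqrt(2.0)
    along = math.erf((s - t0) / scale) - math.erf(-t0 / scale)
    across = math.exp(-r2 / (2.0 * sigma ** 2))
    return peak * across * sigma * math.sqrt(math.pi / 2.0) * along

def visibility(origin, direction, s, blobs):
    """Smooth visibility in (0, 1]: transmittance through all blobs up to s."""
    total = sum(optical_depth(origin, direction, s, c, sig, p)
                for c, sig, p in blobs)
    return math.exp(-total)

# One blob centred 2 units in front of the camera: (center, sigma, peak).
blobs = [((0.0, 0.0, 2.0), 0.3, 5.0)]
origin = (0.0, 0.0, 0.0)
direction = (0.0, 0.0, 1.0)

print(visibility(origin, direction, 1.0, blobs))  # point in front of the blob
print(visibility(origin, direction, 3.0, blobs))  # point behind the blob
```

Because the error function is smooth, visibility here varies continuously as the blob centre or the ray moves, which is the property that allows gradient-based optimization across occlusion boundaries.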

Semi-Supervised Normalized Cuts for Image Segmentation

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.200

Selene E. Chew, N. Cahill

Citations: 35

Predicting Multiple Structured Visual Interpretations

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.337

Debadeepta Dey, V. Ramakrishna, M. Hebert, J. Bagnell

Citations: 27

Variational PatchMatch MultiView Reconstruction and Refinement

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.107

Philipp Heise, B. Jensen, S. Klose, Alois Knoll

Abstract: In this work we propose a novel approach to the problem of multi-view stereo reconstruction. Building upon the previously proposed PatchMatch stereo and PM-Huber algorithm, we introduce an extension to the multi-view scenario that employs an iterative refinement scheme. Our proposed approach uses an extended and robustified volumetric truncated signed distance function representation, which is advantageous for the fusion of refined depth maps and also for raycasting the current reconstruction estimation together with estimated depth normals into arbitrary camera views. We formulate the combined multi-view stereo reconstruction and refinement as a variational optimization problem. The newly introduced plane-based smoothing term in the energy formulation is guided by the current reconstruction confidence and the image contents. Further, we propose an extension of the PatchMatch scheme with an additional KLT step to avoid unnecessary sampling iterations. Improper camera poses are corrected by a direct image alignment step that performs robust outlier compensation by means of a recently proposed kernel lifting framework. To speed up the optimization of the variational formulation, an adapted scheme is used for faster convergence.

Pages: 882-890

Citations: 19

Improving Ferns Ensembles by Sparsifying and Quantising Posterior Probabilities

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.467

Antonio L. Rodríguez, V. Sequeira

Citations: 0

A Matrix Decomposition Perspective to Multiple Graph Matching

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.31

Junchi Yan, Hongteng Xu, H. Zha, Xiaokang Yang, Huanxi Liu, Stephen M. Chu

Citations: 33

Deep Neural Decision Forests

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.172

P. Kontschieder, M. Fiterau, A. Criminisi, S. R. Bulò

Abstract: We present Deep Neural Decision Forests - a novel approach that unifies classification trees with the representation learning functionality known from deep convolutional networks, by training them in an end-to-end manner. To combine these two worlds, we introduce a stochastic and differentiable decision tree model, which steers the representation learning usually conducted in the initial layers of a (deep) convolutional network. Our model differs from conventional deep networks because a decision forest provides the final predictions and it differs from conventional decision forests since we propose a principled, joint and global optimization of split and leaf node parameters. We show experimental results on benchmark machine learning datasets like MNIST and ImageNet and find on-par or superior results when compared to state-of-the-art deep models. Most remarkably, we obtain Top5-Errors of only 7.84%/6.38% on ImageNet validation data when integrating our forests in a single-crop, single/seven model GoogLeNet architecture, respectively. Thus, even without any form of training data set augmentation we are improving on the 6.67% error obtained by the best GoogLeNet architecture (7 models, 144 crops).

Pages: 1467-1475

Citations: 458
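
The abstract above refers to a stochastic and differentiable decision tree. A minimal sketch of what such a tree can look like follows. It assumes, as an illustration rather than as the paper's exact formulation, that each split node routes an input left with probability given by a sigmoid of a linear function, that each leaf holds a class distribution, and that the prediction is the routing-probability-weighted mix of the leaf distributions. The depth-2 layout, the weights, and the leaf distributions are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leaf_probabilities(x, splits):
    """Probability of reaching each of the 4 leaves of a depth-2 tree.

    `splits` holds (weights, bias) for the root, its left child and its
    right child. Each node sends x left with probability sigmoid(w.x + b);
    this soft routing is an assumed, illustrative choice.
    """
    d = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in splits]
    return [
        d[0] * d[1],              # left, then left
        d[0] * (1 - d[1]),        # left, then right
        (1 - d[0]) * d[2],        # right, then left
        (1 - d[0]) * (1 - d[2]),  # right, then right
    ]

def predict(x, splits, leaf_dists):
    """Class probabilities: leaf distributions mixed by routing probability."""
    mu = leaf_probabilities(x, splits)
    n_classes = len(leaf_dists[0])
    return [sum(m * leaf[c] for m, leaf in zip(mu, leaf_dists))
            for c in range(n_classes)]

# Hypothetical parameters: the root splits on feature 0, children on feature 1.
splits = [((4.0, 0.0), 0.0), ((0.0, 4.0), 0.0), ((0.0, 4.0), 0.0)]
leaf_dists = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]]

x = (2.0, 2.0)
mu = leaf_probabilities(x, splits)
pred = predict(x, splits, leaf_dists)
print(pred)
```

Every operation here is smooth in the split weights and the leaf distributions, so in principle both could be trained by gradient descent together with an upstream network, which matches the end-to-end training the abstract describes.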

Interpolation on the Manifold of K Component GMMs

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.330

Hyunwoo J. Kim, N. Adluru, Monami Banerjee, B. Vemuri, Vikas Singh

Abstract: Probability density functions (PDFs) are fundamental "objects" in mathematics with numerous applications in computer vision, machine learning and medical imaging. The feasibility of basic operations such as computing the distance between two PDFs and estimating a mean of a set of PDFs is a direct function of the representation we choose to work with. In this paper, we study the Gaussian mixture model (GMM) representation of the PDFs, motivated by its numerous attractive features: (1) GMMs are arguably more interpretable than, say, square root parameterizations; (2) the model complexity can be explicitly controlled by the number of components; and (3) they are already widely used in many applications. The main contributions of this paper are numerical algorithms to enable basic operations on such objects that strictly respect their underlying geometry. For instance, when operating with a set of k component GMMs, a first order expectation is that the result of simple operations like interpolation and averaging should provide an object that is also a k component GMM. The literature provides very little guidance on enforcing such requirements systematically. It turns out that these tasks are important internal modules for analysis and processing of a field of ensemble average propagators (EAPs), common in diffusion weighted magnetic resonance imaging. We provide proof of principle experiments showing how the proposed algorithms for interpolation can facilitate statistical analysis of such data, essential to many neuroimaging studies. Separately, we also derive interesting connections of our algorithm with functional spaces of Gaussians, which may be of independent interest.

Pages: 2884-2892

Citations: 4

Learning to Transfer: Transferring Latent Task Structures and Its Application to Person-Specific Facial Action Unit Detection

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.430

Timur R. Almaev, Brais Martínez, M. Valstar

Abstract: In this article we explore the problem of constructing person-specific models for the detection of facial Action Units (AUs), addressing the problem from the point of view of Transfer Learning and Multi-Task Learning. Our starting point is the fact that some expressions, such as smiles, are very easily elicited, annotated, and automatically detected, while others are much harder to elicit and to annotate. We thus consider a novel problem: all AU models for the target subject are to be learnt using person-specific annotated data for a reference AU (AU12 in our case), and no data or little data regarding the target AU. In order to design such a model, we propose a novel Multi-Task Learning and the associated Transfer Learning framework, in which we consider both relations across subjects and AUs. That is to say, we consider a tensor structure among the tasks. Our approach hinges on learning the latent relations among tasks using one single reference AU, and then transferring these latent relations to other AUs. We show that we are able to effectively make use of the annotated data for AU12 when learning other person-specific AU models, even in the absence of data for the target task. Finally, we show the excellent performance of our method when small amounts of annotated data for the target tasks are made available.

Pages: 3774-3782

Citations: 41