nocaps: novel object captioning at scale
Harsh Agrawal, Karan Desai, Yufei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson
ICCV 2019, pp. 8947-8956. DOI: https://doi.org/10.1109/ICCV.2019.00904
Abstract: Image captioning models have achieved impressive results on datasets containing limited visual concepts and large amounts of paired image-caption training data. However, if these models are to ever function in the wild, a much larger variety of visual concepts must be learned, ideally from less supervision. To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task. Dubbed ‘nocaps’, for novel object captioning at scale, our benchmark consists of 166,100 human-generated captions describing 15,100 images from the Open Images validation and test sets. The associated training data consists of COCO image-caption pairs, plus Open Images image-level labels and object bounding boxes. Since Open Images contains many more classes than COCO, nearly 400 object classes seen in test images have no or very few associated training captions (hence, nocaps). We extend existing novel object captioning models to establish strong baselines for this benchmark and provide analysis to guide future work.
Joint Scale-Spatial Correlation Tracking with Adaptive Rotation Estimation
Mengdan Zhang, Junliang Xing, Jin Gao, Xinchu Shi, Qiang Wang, Weiming Hu
ICCV Workshops 2015, pp. 595-603. DOI: https://doi.org/10.1109/ICCVW.2015.81
Abstract: Boosted by large and standardized benchmark datasets, visual object tracking has made great progress in recent years and brought about many new trackers. Among these trackers, the correlation filter based tracking scheme exhibits impressive robustness and accuracy. In this work, we present a fully functional correlation filter based tracking algorithm that simultaneously models target appearance changes from spatial displacements, scale variations, and rotation transformations. The proposed tracker first represents the exhaustive template search in the joint scale and spatial space by a block-circulant matrix. Then, by transferring the target template from the Cartesian coordinate system to the log-polar coordinate system, the circulant structure is well preserved for the target even under arbitrary rotation. With this novel representation and transformation, object tracking is performed efficiently and effectively in the joint space with the fast Fourier transform. Experimental results on the VOT 2015 benchmark dataset demonstrate its superior performance over state-of-the-art tracking algorithms.
{"title":"Aggregating Local Deep Features for Image Retrieval","authors":"Artem Babenko, V. Lempitsky","doi":"10.1109/ICCV.2015.150","DOIUrl":"https://doi.org/10.1109/ICCV.2015.150","url":null,"abstract":"Several recent works have shown that image descriptors produced by deep convolutional neural networks provide state-of-the-art performance for image classification and retrieval problems. It also has been shown that the activations from the convolutional layers can be interpreted as local features describing particular image regions. These local features can be aggregated using aggregating methods developed for local features (e.g. Fisher vectors), thus providing new powerful global descriptor. In this paper we investigate possible ways to aggregate local deep features to produce compact descriptors for image retrieval. First, we show that deep features and traditional hand-engineered features have quite different distributions of pairwise similarities, hence existing aggregation methods have to be carefully re-evaluated. Such re-evaluation reveals that in contrast to shallow features, the simple aggregation method based on sum pooling provides the best performance for deep convolutional features. This method is efficient, has few parameters, and bears little risk of overfitting when e.g. learning the PCA matrix. In addition, we suggest a simple yet efficient query expansion scheme suitable for the proposed aggregation method. Overall, the new compact global descriptor improves the state-of-the-art on four common benchmarks considerably.","PeriodicalId":72022,"journal":{"name":"... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision","volume":"3 1","pages":"1269-1277"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81353089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structural Kernel Learning for Large Scale Multiclass Object Co-detection","authors":"Zeeshan Hayder, Xuming He, M. Salzmann","doi":"10.1109/ICCV.2015.302","DOIUrl":"https://doi.org/10.1109/ICCV.2015.302","url":null,"abstract":"Exploiting contextual relationships across images has recently proven key to improve object detection. The resulting object co-detection algorithms, however, fail to exploit the correlations between multiple classes and, for scalability reasons are limited to modeling object instance similarity with relatively low-dimensional hand-crafted features. Here, we address the problem of multiclass object co-detection for large scale datasets. To this end, we formulate co-detection as the joint multiclass labeling of object candidates obtained in a class-independent manner. To exploit the correlations between objects, we build a fully-connected CRF on the candidates, which explicitly incorporates both geometric layout relations across object classes and similarity relations across multiple images. We then introduce a structural boosting algorithm that lets us exploits rich, high-dimensional deep network features to learn object similarity within our fully-connected CRF. Our experiments on PASCAL VOC 2007 and 2012 evidences the benefits of our approach over object detection with RCNN, single-image CRF methods and state-of-the-art co-detection algorithms.","PeriodicalId":72022,"journal":{"name":"... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision","volume":"10 1","pages":"2632-2640"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80880384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shape Index Descriptors Applied to Texture-Based Galaxy Analysis
K. S. Pedersen, Kristoffer Stensbo-Smidt, A. Zirm, C. Igel
ICCV 2013, pp. 2440-2447. DOI: https://doi.org/10.1109/ICCV.2013.303
Abstract: A texture descriptor based on the shape index and the accompanying curvedness measure is proposed, and it is evaluated for the automated analysis of astronomical image data. A representative sample of images of low-redshift galaxies from the Sloan Digital Sky Survey (SDSS) serves as a test bed. The goal of applying texture descriptors to these data is to extract novel information about galaxies, information which is often lost in more traditional analysis. In this study, we build a regression model for predicting a spectroscopic quantity, the specific star-formation rate (sSFR). As texture features we consider multi-scale gradient orientation histograms as well as multi-scale shape index histograms, which lead to a new descriptor. Our results show that we can successfully predict spectroscopic quantities from the texture in optical multi-band images. We successfully recover the observed bi-modal distribution of galaxies into quiescent and star-forming. The state of the art for predicting the sSFR is a color-based physical model. We significantly improve its accuracy by augmenting the model with texture information. This study is the first step towards enabling the quantification of physical galaxy properties from imaging data alone.
{"title":"Energy Association Filter for Online Data Association with Missing Data","authors":"E. Abir, Dubuisson Séverine, Béréziat Dominique","doi":"10.1007/978-3-540-89682-1_18","DOIUrl":"https://doi.org/10.1007/978-3-540-89682-1_18","url":null,"abstract":"","PeriodicalId":72022,"journal":{"name":"... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision","volume":"25 1","pages":"244-257"},"PeriodicalIF":0.0,"publicationDate":"2007-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86990220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditional Random Fields for Contextual Human Motion Recognition
C. Sminchisescu, Atul Kanaujia, Zhiguo Li, Dimitris N. Metaxas
ICCV 2005, pp. 1808-1815. DOI: https://doi.org/10.1109/ICCV.2005.59
Abstract: We present algorithms for recognizing human motion in monocular video sequences, based on discriminative conditional random fields (CRFs) and maximum entropy Markov models (MEMMs). Existing approaches to this problem typically use generative (joint) structures like the hidden Markov model (HMM). Therefore they have to make simplifying, often unrealistic assumptions on the conditional independence of observations given the motion class labels, and cannot accommodate overlapping features or long-term contextual dependencies in the observation sequence. In contrast, conditional models like CRFs seamlessly represent contextual dependencies, support efficient, exact inference using dynamic programming, and their parameters can be trained using convex optimization. We introduce conditional graphical models as complementary tools for human motion recognition and present an extensive set of experiments showing that they typically outperform HMMs, not only in classifying diverse human activities like walking, jumping, running, picking, or dancing, but also in discriminating among subtle motion styles like normal walk and wander walk.
{"title":"An affine invariant deformable shape representation for general curves","authors":"Astrom","doi":"10.1109/ICCV.2003.1238477","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238477","url":null,"abstract":"Automatic construction of shape models from examples has been the focus of intense research during the last couple of years. These methods have proved to be useful for shape segmentation, tracking and shape understanding. In this paper novel theory to automate shape modelling is described. The theory is intrinsically defined for curves although curves are infinite dimensional objects. The theory is independent of parameterisation and affine transformations. We suggest a method for implementing the ideas and compare it to minimising the description length of the model (MDL). It turns out that the accuracy of the two methods is comparable. Both the MDL and our approach can get stuck at local minima. Our algorithm is less computational expensive and relatively good solutions are obtained after a few iterations. The MDL is, however, better suited at fine-tuning the parameters given good initial estimates to the problem. It is shown that a combination of the two methods outperforms either on its own.","PeriodicalId":72022,"journal":{"name":"... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision","volume":"50 1","pages":"1142-1149"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76499372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Very High Accuracy Velocity Estimation using Orientation Tensors Parametric Motion and Simultaneous Segmentation of the Motion Field","authors":"Gunnar Farnebäck","doi":"10.1109/ICCV.2001.10042","DOIUrl":"https://doi.org/10.1109/ICCV.2001.10042","url":null,"abstract":"In a previous paper, the author presented a new velocity estimation algorithm, using orientation tensors and parametric motion models to provide both fast and accurate results. One of the tradeoffs between accuracy and speed was that no attempts were made to obtain regions of coherent motion when estimating the parametric models. In this paper we show how this can be improved by doing a simultaneous segmentation of the motion field. The resulting algorithm is slower than the previous one, but more accurate. This is shown by evaluation on the well-known Yosemite sequence, where already the previous algorithm showed an accuracy which was substantially better than for earlier published methods. This result has now been improved further.","PeriodicalId":72022,"journal":{"name":"... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision","volume":"50 1","pages":"171-177"},"PeriodicalIF":0.0,"publicationDate":"2001-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86108139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Projection Matrices and their Applications in Computer Vision","authors":"Lior Wolf, A. Shashua","doi":"10.1109/ICCV.2001.10057","DOIUrl":"https://doi.org/10.1109/ICCV.2001.10057","url":null,"abstract":"","PeriodicalId":72022,"journal":{"name":"... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision","volume":"11 1","pages":"412-419"},"PeriodicalIF":0.0,"publicationDate":"2001-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75317134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}