... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision: Latest Publications

nocaps: novel object captioning at scale
Harsh Agrawal, Karan Desai, Yufei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson
DOI: 10.1109/ICCV.2019.00904 | Pages: 8947-8956 | Published: 2019-01-01
Abstract: Image captioning models have achieved impressive results on datasets containing limited visual concepts and large amounts of paired image-caption training data. However, if these models are to ever function in the wild, a much larger variety of visual concepts must be learned, ideally from less supervision. To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task. Dubbed ‘nocaps’, for novel object captioning at scale, our benchmark consists of 166,100 human-generated captions describing 15,100 images from the Open Images validation and test sets. The associated training data consists of COCO image-caption pairs, plus Open Images image-level labels and object bounding boxes. Since Open Images contains many more classes than COCO, nearly 400 object classes seen in test images have no or very few associated training captions (hence, nocaps). We extend existing novel object captioning models to establish strong baselines for this benchmark and provide analysis to guide future work.
Citations: 249

Joint Scale-Spatial Correlation Tracking with Adaptive Rotation Estimation
Mengdan Zhang, Junliang Xing, Jin Gao, Xinchu Shi, Qiang Wang, Weiming Hu
DOI: 10.1109/ICCVW.2015.81 | Pages: 595-603 | Published: 2015-12-07
Abstract: Boosted by large and standardized benchmark datasets, visual object tracking has made great progress in recent years and brought about many new trackers. Among these trackers, the correlation filter based tracking scheme exhibits impressive robustness and accuracy. In this work, we present a fully functional correlation filter based tracking algorithm that simultaneously models target appearance changes from spatial displacements, scale variations, and rotation transformations. The proposed tracker first represents the exhaustive template search in the joint scale and spatial space by a block-circulant matrix. Then, by transferring the target template from the Cartesian coordinate system to the Log-Polar coordinate system, the circulant structure is preserved for the target even under full orientation rotation. With this novel representation and transformation, object tracking is performed efficiently and effectively in the joint space with the fast Fourier transform. Experimental results on the VOT 2015 benchmark dataset demonstrate its superior performance over state-of-the-art tracking algorithms.
Citations: 56

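To make the Log-Polar trick concrete, here is a minimal NumPy sketch (not the authors' code; the grid sizes and interpolation scheme are assumptions) of the Cartesian-to-Log-Polar resampling. After the warp, an in-plane rotation of the target becomes a circular shift along the angular axis, so the circulant structure that FFT-based correlation filters rely on survives rotation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_log_polar(patch, n_rho=64, n_theta=64):
    """Resample a square grayscale patch onto a log-polar grid about its center."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))        # log-spaced radii
    theta = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    r, t = np.meshgrid(rho, theta, indexing="ij")               # rows: rho, cols: theta
    ys = cy + r * np.sin(t)
    xs = cx + r * np.cos(t)
    return map_coordinates(patch, [ys, xs], order=1, mode="nearest")

# Rotating the patch by `a` degrees shows up in log-polar space as roughly
# np.roll(to_log_polar(patch), round(a / 360 * n_theta), axis=1),
# which FFT-based correlation can localize just like a spatial translation.
```

Scale changes similarly become shifts along the radial (rho) axis, which is why scale, translation, and rotation can all be handled within one Fourier-domain correlation framework.
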
Aggregating Local Deep Features for Image Retrieval
Artem Babenko, V. Lempitsky
DOI: 10.1109/ICCV.2015.150 | Pages: 1269-1277 | Published: 2015-12-07
Abstract: Several recent works have shown that image descriptors produced by deep convolutional neural networks provide state-of-the-art performance for image classification and retrieval problems. It has also been shown that the activations from the convolutional layers can be interpreted as local features describing particular image regions. These local features can be aggregated using methods developed for hand-crafted local features (e.g. Fisher vectors), thus providing a new powerful global descriptor. In this paper we investigate possible ways to aggregate local deep features to produce compact descriptors for image retrieval. First, we show that deep features and traditional hand-engineered features have quite different distributions of pairwise similarities, hence existing aggregation methods have to be carefully re-evaluated. Such re-evaluation reveals that, in contrast to shallow features, the simple aggregation method based on sum pooling provides the best performance for deep convolutional features. This method is efficient, has few parameters, and bears little risk of overfitting when, e.g., learning the PCA matrix. In addition, we suggest a simple yet efficient query expansion scheme suitable for the proposed aggregation method. Overall, the new compact global descriptor improves the state-of-the-art on four common benchmarks considerably.
Citations: 82

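As a concrete illustration, here is a minimal sketch of the sum-pooling aggregation the abstract singles out; the PCA step and the order of normalization are our assumptions, not a transcript of the authors' pipeline.

```python
import numpy as np

def sum_pooled_descriptor(feature_map, pca_matrix=None):
    """feature_map: (C, H, W) activations from one convolutional layer.
    Returns a compact global descriptor for retrieval."""
    desc = feature_map.reshape(feature_map.shape[0], -1).sum(axis=1)  # sum over H*W
    desc = desc / (np.linalg.norm(desc) + 1e-12)                      # L2 normalize
    if pca_matrix is not None:             # optional PCA compression/whitening
        desc = pca_matrix @ desc
        desc = desc / (np.linalg.norm(desc) + 1e-12)
    return desc
```

Retrieval then ranks database images by cosine similarity, i.e. a dot product between the normalized descriptors of the query and each database image.
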
Structural Kernel Learning for Large Scale Multiclass Object Co-detection
Zeeshan Hayder, Xuming He, M. Salzmann
DOI: 10.1109/ICCV.2015.302 | Pages: 2632-2640 | Published: 2015-12-07
Abstract: Exploiting contextual relationships across images has recently proven key to improving object detection. The resulting object co-detection algorithms, however, fail to exploit the correlations between multiple classes and, for scalability reasons, are limited to modeling object instance similarity with relatively low-dimensional hand-crafted features. Here, we address the problem of multiclass object co-detection for large-scale datasets. To this end, we formulate co-detection as the joint multiclass labeling of object candidates obtained in a class-independent manner. To exploit the correlations between objects, we build a fully-connected CRF on the candidates, which explicitly incorporates both geometric layout relations across object classes and similarity relations across multiple images. We then introduce a structural boosting algorithm that lets us exploit rich, high-dimensional deep network features to learn object similarity within our fully-connected CRF. Our experiments on PASCAL VOC 2007 and 2012 demonstrate the benefits of our approach over object detection with RCNN, single-image CRF methods, and state-of-the-art co-detection algorithms.
Citations: 8

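To fix ideas, here is a minimal sketch of the kind of joint labeling objective described above: unary detection scores per candidate, plus a pairwise term that rewards assigning the same class to candidates that look alike across images. The exact potentials, the geometric layout terms, and the learned weights are assumptions on our part, not the paper's model.

```python
import numpy as np

def codetection_score(labels, unary, similarity, background=0):
    """labels: (N,) class index per object candidate (pooled across images);
    unary: (N, K) per-class scores for each class-independent proposal;
    similarity: (N, N) symmetric appearance similarity between candidates."""
    score = unary[np.arange(len(labels)), labels].sum()
    # pairwise reward when two non-background candidates share a class
    same = (labels[:, None] == labels[None, :]) & (labels[:, None] != background)
    np.fill_diagonal(same, False)              # no self-pairs
    score += (similarity * same).sum() / 2.0   # count each candidate pair once
    return score
```

Inference then searches for the labeling that maximizes this score; the paper's structural boosting additionally learns the similarity function itself from high-dimensional deep features.
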
Shape Index Descriptors Applied to Texture-Based Galaxy Analysis
K. S. Pedersen, Kristoffer Stensbo-Smidt, A. Zirm, C. Igel
DOI: 10.1109/ICCV.2013.303 | Pages: 2440-2447 | Published: 2013-12-01
Abstract: A texture descriptor based on the shape index and the accompanying curvedness measure is proposed and evaluated for the automated analysis of astronomical image data. A representative sample of images of low-redshift galaxies from the Sloan Digital Sky Survey (SDSS) serves as a test bed. The goal of applying texture descriptors to these data is to extract novel information about galaxies, information which is often lost in more traditional analysis. In this study, we build a regression model for predicting a spectroscopic quantity, the specific star-formation rate (sSFR). As texture features we consider multi-scale gradient orientation histograms as well as multi-scale shape index histograms, which lead to a new descriptor. Our results show that we can successfully predict spectroscopic quantities from the texture in optical multi-band images. We successfully recover the observed bi-modal distribution of galaxies into quiescent and star-forming. The state of the art for predicting the sSFR is a color-based physical model. We significantly improve its accuracy by augmenting the model with texture information. This study is the first step towards enabling the quantification of physical galaxy properties from imaging data alone.
Citations: 19

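For reference, the shape index S in [-1, 1] and the curvedness C are the standard Koenderink-van Doorn functions of the principal curvatures k1 >= k2 of the image surface: S = (2/pi) arctan((k1 + k2)/(k1 - k2)) and C = sqrt((k1^2 + k2^2)/2). Below is a minimal sketch of computing them from Gaussian-derivative Hessians; the scale selection and histogram binning used in the paper are not reproduced here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def shape_index_and_curvedness(image, sigma=2.0):
    """Per-pixel shape index in [-1, 1] and curvedness at scale sigma."""
    image = np.asarray(image, dtype=float)
    # Hessian entries via Gaussian derivatives (axis 0 = y, axis 1 = x)
    Lxx = gaussian_filter(image, sigma, order=(0, 2))
    Lyy = gaussian_filter(image, sigma, order=(2, 0))
    Lxy = gaussian_filter(image, sigma, order=(1, 1))
    # principal curvatures = eigenvalues of the 2x2 Hessian, k1 >= k2
    root = np.sqrt((Lxx - Lyy) ** 2 + 4.0 * Lxy ** 2)
    k1 = 0.5 * (Lxx + Lyy + root)
    k2 = 0.5 * (Lxx + Lyy - root)
    s = (2.0 / np.pi) * np.arctan2(k1 + k2, k1 - k2)   # shape index
    c = np.sqrt(0.5 * (k1 ** 2 + k2 ** 2))             # curvedness
    return s, c
```

The descriptor then builds multi-scale histograms of S, with C playing a role analogous to gradient magnitude in orientation histograms.
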
Energy Association Filter for Online Data Association with Missing Data
E. Abir, Dubuisson Séverine, Béréziat Dominique
DOI: 10.1007/978-3-540-89682-1_18 | Pages: 244-257 | Published: 2007-03-08
Citations: 0

Conditional Random Fields for Contextual Human Motion Recognition
C. Sminchisescu, Atul Kanaujia, Zhiguo Li, Dimitris N. Metaxas
DOI: 10.1109/ICCV.2005.59 | Pages: 1808-1815 | Published: 2005-10-17
Abstract: We present algorithms for recognizing human motion in monocular video sequences, based on discriminative conditional random fields (CRFs) and maximum entropy Markov models (MEMMs). Existing approaches to this problem typically use generative (joint) structures like the hidden Markov model (HMM). They therefore have to make simplifying, often unrealistic assumptions about the conditional independence of observations given the motion class labels, and cannot accommodate overlapping features or long-term contextual dependencies in the observation sequence. In contrast, conditional models like CRFs seamlessly represent contextual dependencies, support efficient, exact inference using dynamic programming, and their parameters can be trained using convex optimization. We introduce conditional graphical models as complementary tools for human motion recognition and present an extensive set of experiments showing how these typically outperform HMMs in classifying not only diverse human activities like walking, jumping, running, picking, or dancing, but also in discriminating among subtle motion styles like normal walk and wander walk.
Citations: 139

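The "efficient, exact inference using dynamic programming" that the abstract credits to chain-structured conditional models is Viterbi decoding over per-frame scores. Here is a minimal sketch; the feature functions and learned weights of the actual model are not shown, and the score matrices are assumed inputs.

```python
import numpy as np

def crf_viterbi(unary, transition):
    """unary: (T, K) per-frame class scores (these may depend on overlapping,
    long-range observation features -- the CRF advantage over an HMM);
    transition: (K, K) label-to-label compatibility scores.
    Returns the highest-scoring label sequence."""
    T, K = unary.shape
    score = unary[0].copy()
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transition      # (prev label, current label)
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + unary[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):               # trace back the best path
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```

Because only the label chain is assumed Markov, the unary scores at frame t are free to look at arbitrary windows of the observation sequence, which is exactly what an HMM's independence assumptions forbid.
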
An affine invariant deformable shape representation for general curves
Astrom
DOI: 10.1109/ICCV.2003.1238477 | Pages: 1142-1149 | Published: 2003-10-13
Abstract: Automatic construction of shape models from examples has been the focus of intense research during the last couple of years. These methods have proved useful for shape segmentation, tracking, and shape understanding. In this paper, novel theory to automate shape modelling is described. The theory is intrinsically defined for curves, although curves are infinite-dimensional objects. The theory is independent of parameterisation and affine transformations. We suggest a method for implementing the ideas and compare it to minimising the description length of the model (MDL). It turns out that the accuracy of the two methods is comparable. Both MDL and our approach can get stuck at local minima. Our algorithm is less computationally expensive, and relatively good solutions are obtained after a few iterations. MDL is, however, better suited to fine-tuning the parameters given good initial estimates for the problem. It is shown that a combination of the two methods outperforms either on its own.
Citations: 13

Very High Accuracy Velocity Estimation using Orientation Tensors, Parametric Motion, and Simultaneous Segmentation of the Motion Field
Gunnar Farnebäck
DOI: 10.1109/ICCV.2001.10042 | Pages: 171-177 | Published: 2001-07-07
Abstract: In a previous paper, the author presented a new velocity estimation algorithm, using orientation tensors and parametric motion models to provide both fast and accurate results. One of the tradeoffs between accuracy and speed was that no attempt was made to obtain regions of coherent motion when estimating the parametric models. In this paper we show how this can be improved by performing a simultaneous segmentation of the motion field. The resulting algorithm is slower than the previous one, but more accurate. This is shown by evaluation on the well-known Yosemite sequence, where the previous algorithm already showed accuracy substantially better than that of earlier published methods. This result has now been improved further.
Citations: 140

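A minimal sketch of the orientation-tensor idea underlying the estimator (an illustration under our own assumptions, not Farnebäck's polynomial-expansion tensors): locally averaged outer products of spatio-temporal gradients form a 3x3 tensor whose least-eigenvalue eigenvector points along the local spatio-temporal direction of constant brightness, from which a velocity can be read off.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def orientation_tensors(volume, sigma=1.5):
    """volume: (T, H, W) grayscale sequence -> (T, H, W, 3, 3) tensor field."""
    gt, gy, gx = np.gradient(volume.astype(float))      # axes: t, y, x
    g = np.stack([gx, gy, gt], axis=-1)                 # gradient per voxel
    tensors = g[..., :, None] * g[..., None, :]         # outer products g g^T
    for i in range(3):                                  # local spatio-temporal averaging
        for j in range(3):
            tensors[..., i, j] = gaussian_filter(tensors[..., i, j], sigma)
    return tensors

def velocity_from_tensor(tensor3x3):
    """Least-eigenvalue eigenvector (x, y, t) -> velocity (vx, vy);
    degenerate where the temporal component is ~0 (the aperture problem)."""
    _, vecs = np.linalg.eigh(tensor3x3)   # eigenvalues in ascending order
    d = vecs[:, 0]
    return d[0] / d[2], d[1] / d[2]
```

The parametric motion models then fit, e.g., affine velocity fields over regions where these tensors are consistent, and the segmentation step in this paper chooses those regions jointly with the fits.
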
On Projection Matrices and their Applications in Computer Vision
Lior Wolf, A. Shashua
DOI: 10.1109/ICCV.2001.10057 | Pages: 412-419 | Published: 2001-01-01
Citations: 28