基于多核相关模型的图像标注多标签预测

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI:10.1109/CVPRW.2009.5204274

Oksana Yakhnenko, Vasant G Honavar

{"title":"基于多核相关模型的图像标注多标签预测","authors":"Oksana Yakhnenko, Vasant G Honavar","doi":"10.1109/CVPRW.2009.5204274","DOIUrl":null,"url":null,"abstract":"Image annotation is a challenging task that allows to correlate text keywords with an image. In this paper we address the problem of image annotation using Kernel Multiple Linear Regression model. Multiple Linear Regression (MLR) model reconstructs image caption from an image by performing a linear transformation of an image into some semantic space, and then recovers the caption by performing another linear transformation from the semantic space into the label space. The model is trained so that model parameters minimize the error of reconstruction directly. This model is related to Canonical Correlation Analysis (CCA) which maps both images and caption into the semantic space to minimize the distance of mapping in the semantic space. Kernel trick is then used for the MLR resulting in Kernel Multiple Linear Regression model. The solution to KMLR is a solution to the generalized eigen-value problem, related to KCCA (Kernel Canonical Correlation Analysis). We then extend Kernel Multiple Linear Regression and Kernel Canonical Correlation analysis models to multiple kernel setting, to allow various representations of images and captions. We present results for image annotation using Multiple Kernel Learning CCA and MLR on Oliva and Torralba (2001) scene recognition that show kernel selection behaviour.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Multiple label prediction for image annotation with multiple Kernel correlation models\",\"authors\":\"Oksana Yakhnenko, Vasant G Honavar\",\"doi\":\"10.1109/CVPRW.2009.5204274\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Image annotation is a challenging task that allows to correlate text keywords with an image. In this paper we address the problem of image annotation using Kernel Multiple Linear Regression model. Multiple Linear Regression (MLR) model reconstructs image caption from an image by performing a linear transformation of an image into some semantic space, and then recovers the caption by performing another linear transformation from the semantic space into the label space. The model is trained so that model parameters minimize the error of reconstruction directly. This model is related to Canonical Correlation Analysis (CCA) which maps both images and caption into the semantic space to minimize the distance of mapping in the semantic space. Kernel trick is then used for the MLR resulting in Kernel Multiple Linear Regression model. The solution to KMLR is a solution to the generalized eigen-value problem, related to KCCA (Kernel Canonical Correlation Analysis). We then extend Kernel Multiple Linear Regression and Kernel Canonical Correlation analysis models to multiple kernel setting, to allow various representations of images and captions. We present results for image annotation using Multiple Kernel Learning CCA and MLR on Oliva and Torralba (2001) scene recognition that show kernel selection behaviour.\",\"PeriodicalId\":431981,\"journal\":{\"name\":\"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-06-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPRW.2009.5204274\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW.2009.5204274","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 15

摘要

图像注释是一项具有挑战性的任务，它允许将文本关键字与图像关联起来。本文利用核多元线性回归模型解决了图像标注问题。多元线性回归(Multiple Linear Regression, MLR)模型通过对图像进行某种语义空间的线性变换来重建图像标题，然后对图像进行另一次从语义空间到标签空间的线性变换来恢复图像标题。对模型进行训练，使模型参数直接减小重构误差。该模型与典型相关分析(CCA)有关，典型相关分析将图像和标题都映射到语义空间中，以最小化语义空间中的映射距离。然后将核技巧用于MLR，从而得到核多元线性回归模型。KMLR的解是广义特征值问题的解，与核典型相关分析(KCCA)有关。然后，我们将核多元线性回归和核典型相关分析模型扩展到多个核设置，以允许图像和标题的各种表示。我们展示了在Oliva和Torralba(2001)场景识别上使用多核学习CCA和MLR进行图像注释的结果，这些结果显示了核选择行为。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Multiple label prediction for image annotation with multiple Kernel correlation models

Image annotation is a challenging task that allows to correlate text keywords with an image. In this paper we address the problem of image annotation using Kernel Multiple Linear Regression model. Multiple Linear Regression (MLR) model reconstructs image caption from an image by performing a linear transformation of an image into some semantic space, and then recovers the caption by performing another linear transformation from the semantic space into the label space. The model is trained so that model parameters minimize the error of reconstruction directly. This model is related to Canonical Correlation Analysis (CCA) which maps both images and caption into the semantic space to minimize the distance of mapping in the semantic space. Kernel trick is then used for the MLR resulting in Kernel Multiple Linear Regression model. The solution to KMLR is a solution to the generalized eigen-value problem, related to KCCA (Kernel Canonical Correlation Analysis). We then extend Kernel Multiple Linear Regression and Kernel Canonical Correlation analysis models to multiple kernel setting, to allow various representations of images and captions. We present results for image annotation using Multiple Kernel Learning CCA and MLR on Oliva and Torralba (2001) scene recognition that show kernel selection behaviour.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

自引率

0.00%

发文量