Image Clustering Using Visual and Text Keywords

2007 International Symposium on Computational Intelligence in Robotics and Automation. Pub Date: 2007-06-20. DOI: 10.1109/CIRA.2007.382923

R. Agrawal, Changhua Wu, W. Grosky, F. Fotouhi


Citations: 4

Abstract

Classical image classification approaches rely on low-level features, but the high dimensionality of these feature spaces makes feature selection and distance measurement difficult during the clustering process. In this paper, we propose an approach that generates visual keywords and combines the visual and text keywords of an image into a multimodal vector for image classification. This multimodality helps in extracting image-to-image, text-to-text, and text-to-image relations. A visual keyword is derived by vector quantization of image tiles. We arrange the visual keywords in a manner analogous to the term-document matrix in information retrieval. Combining the visual keywords with text keywords improves the quality of classification. We use a recently proposed nonlinear dimensionality reduction technique, diffusion maps, to reduce the dimensionality of the image representation. Our method is evaluated on two public datasets: LabelMe and Corel. The results support the conclusion that the proposed method of combining visual and text keywords is robust and produces good-quality clusters.
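The pipeline described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the tile size, the codebook size, the plain k-means quantizer, the Gaussian kernel with a median-distance bandwidth, and the final k-means clustering step are all assumptions chosen for illustration, since the abstract does not specify them. The sketch tiles each image, quantizes the tiles into visual keywords, builds an image-by-keyword count matrix (the transpose of a term-document layout), appends text keyword counts, and embeds the result with a basic diffusion map.

```python
import numpy as np

def image_tiles(img, tile):
    """Split an H x W (x C) image into non-overlapping tile x tile patches, one flattened patch per row."""
    h, w = img.shape[0] // tile * tile, img.shape[1] // tile * tile
    img = img[:h, :w].reshape(h // tile, tile, w // tile, tile, -1)
    return img.transpose(0, 2, 1, 3, 4).reshape(-1, tile * tile * img.shape[-1])

def quantize(X, centers):
    """Assign each row of X to its nearest center (squared Euclidean distance)."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; stands in for whatever vector quantizer is actually used."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = quantize(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def visual_keyword_matrix(images, tile, k):
    """Rows are images, columns are visual keywords, entries are tile counts."""
    tiles = [image_tiles(im, tile).astype(float) for im in images]
    codebook = kmeans(np.vstack(tiles), k)
    counts = np.zeros((len(images), k))
    for i, t in enumerate(tiles):
        counts[i] = np.bincount(quantize(t, codebook), minlength=k)
    return counts

def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Basic diffusion map: Gaussian kernel, row-normalized Markov matrix, leading non-trivial eigenvectors."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    if epsilon is None:
        epsilon = np.median(sq[sq > 0])  # bandwidth heuristic (assumption)
    K = np.exp(-sq / epsilon)
    d = K.sum(axis=1)
    # S = D^{-1/2} K D^{-1/2} is symmetric and shares eigenvalues with P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / vecs[:, [0]]  # right eigenvectors of P, scaled so the trivial one is constant
    return psi[:, 1:n_components + 1] * vals[1:n_components + 1] ** t

# Toy data: four dark and four bright 8x8 images, each with one of two text keywords.
images = [np.zeros((8, 8))] * 4 + [np.full((8, 8), 200.0)] * 4
text = np.array([[1, 0]] * 4 + [[0, 1]] * 4, dtype=float)

visual = visual_keyword_matrix(images, tile=4, k=2)
features = np.hstack([visual, text])          # multimodal vector per image
emb = diffusion_map(features, n_components=1)
labels = quantize(emb, kmeans(emb, k=2))      # cluster in the reduced space
print(labels)
```

On this toy input the dark and bright images fall into separate clusters. Real data would likely need weighting or normalization of the visual and text blocks before concatenation, which this sketch omits.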
