Automatic image annotation and retrieval using cross-media relevance models

Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Pub Date: 2003-07-28
DOI: 10.1145/860435.860459

J. Jeon, V. Lavrenko, R. Manmatha

Citations: 1342

Abstract

Libraries have traditionally used manual image annotation to index and later retrieve their image collections. However, manual image annotation is an expensive and labor-intensive procedure, and hence there has been great interest in automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate images and to retrieve them given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as a model based on word-blob co-occurrence and twice as good as a state-of-the-art model derived from machine translation. Our approach shows the usefulness of formal information retrieval models for the task of image annotation and retrieval.
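The abstract gives the idea but not the estimation details, so the following is only a minimal sketch of how a word could be scored against the blobs of an unannotated image. It assumes each training image is a pair of annotation words and integer blob IDs (cluster indices), a uniform prior over training images, and linearly smoothed relative frequencies. The function names `cmrm_word_scores` and `annotate` and the smoothing weights `alpha` and `beta` are hypothetical and are not taken from the source.

```python
from collections import Counter


def cmrm_word_scores(training, query_blobs, alpha=0.1, beta=0.1):
    """Score each word w by sum over J of P(J) * P(w|J) * prod_i P(b_i|J).

    training: list of (words, blobs) pairs, one per training image J.
    query_blobs: blob IDs of the image to annotate.
    alpha, beta: assumed smoothing weights on collection-wide frequencies.
    """
    # Collection-wide counts, used to smooth the per-image estimates.
    word_totals, blob_totals = Counter(), Counter()
    for words, blobs in training:
        word_totals.update(words)
        blob_totals.update(blobs)
    n_words = sum(word_totals.values())
    n_blobs = sum(blob_totals.values())
    prior = 1.0 / len(training)  # uniform P(J): an assumption

    scores = Counter()
    for words, blobs in training:
        w_counts, b_counts = Counter(words), Counter(blobs)
        # Guard against images with no words or no blobs.
        w_len, b_len = max(len(words), 1), max(len(blobs), 1)
        # P(b_1..b_m | J) as a product of smoothed blob probabilities.
        blob_prob = 1.0
        for b in query_blobs:
            blob_prob *= ((1 - beta) * b_counts[b] / b_len
                          + beta * blob_totals[b] / n_blobs)
        # Accumulate the joint probability of each word with the query blobs.
        for w in word_totals:
            word_prob = ((1 - alpha) * w_counts[w] / w_len
                         + alpha * word_totals[w] / n_words)
            scores[w] += prior * word_prob * blob_prob
    return scores


def annotate(training, query_blobs, k=2):
    """Return the k highest-scoring words for an image given its blobs."""
    scores = cmrm_word_scores(training, query_blobs)
    return [w for w, _ in scores.most_common(k)]


# Toy training set: (annotation words, blob IDs) per image.
training = [
    (["tiger", "grass"], [1, 2]),
    (["sky", "plane"], [3, 4]),
    (["tiger", "water"], [1, 5]),
]
print(annotate(training, [1, 2]))
```

Retrieval with a single-word query could reuse the same scores by ranking test images by the score they assign to the query word; that ranking step is likewise an assumption about how the annotation model would be applied.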
