HANOLISTIC: A Hierarchical Automatic Image Annotation System Using Holistic Approach

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops Pub Date : 2009-06-20 DOI:10.1109/CVPRW.2009.5204236

Özge Öztimur Karadag, F. Yarman-Vural

{"title":"HANOLISTIC: A Hierarchical Automatic Image Annotation System Using Holistic Approach","authors":"Özge Öztimur Karadag, F. Yarman-Vural","doi":"10.1109/CVPRW.2009.5204236","DOIUrl":null,"url":null,"abstract":"Automatic image annotation is the process of assigning keywords to digital images depending on the content information. In one sense, it is a mapping from the visual content information to the semantic context information. In this study, we propose a novel approach for automatic image annotation problem, where the annotation is formulated as a multivariate mapping from a set of independent descriptor spaces, representing a whole image, to a set of words, representing class labels. For this purpose, a hierarchical annotation architecture, named as HANOLISTIC (hierarchical image annotation system using holistic approach), is defined with two layers. The first layer, called level 0 consists of annotators each of which is fed by a set of distinct descriptors, extracted from the whole image. This enables us to represent the image at each annotator by a different visual property of a descriptor. Since, we use the whole image, the problematic segmentation process is avoided. Training of each annotator is accomplished by a supervised learning paradigm, where each word is considered as a class label. Note that, this approach is slightly different then the classical training approaches, where each data has a unique label. In the proposed system, since each image has one or more annotating words, we assume that an image belongs to more than one class. The output of the level 0 annotators indicate the membership values of the words in the vocabulary, to belong an image. These membership values from each annotator is, then, aggregated at the second layer to obtain meta level annotator. Finally, a set of words from the vocabulary is selected based on the ranking of the output of meta level. The hierarchical annotation system proposed in this study outperforms state of the art annotation systems based on segmental and holistic approaches.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW.2009.5204236","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 10

Abstract

Automatic image annotation is the process of assigning keywords to digital images depending on the content information. In one sense, it is a mapping from the visual content information to the semantic context information. In this study, we propose a novel approach for automatic image annotation problem, where the annotation is formulated as a multivariate mapping from a set of independent descriptor spaces, representing a whole image, to a set of words, representing class labels. For this purpose, a hierarchical annotation architecture, named as HANOLISTIC (hierarchical image annotation system using holistic approach), is defined with two layers. The first layer, called level 0 consists of annotators each of which is fed by a set of distinct descriptors, extracted from the whole image. This enables us to represent the image at each annotator by a different visual property of a descriptor. Since, we use the whole image, the problematic segmentation process is avoided. Training of each annotator is accomplished by a supervised learning paradigm, where each word is considered as a class label. Note that, this approach is slightly different then the classical training approaches, where each data has a unique label. In the proposed system, since each image has one or more annotating words, we assume that an image belongs to more than one class. The output of the level 0 annotators indicate the membership values of the words in the vocabulary, to belong an image. These membership values from each annotator is, then, aggregated at the second layer to obtain meta level annotator. Finally, a set of words from the vocabulary is selected based on the ranking of the output of meta level. The hierarchical annotation system proposed in this study outperforms state of the art annotation systems based on segmental and holistic approaches.

查看原文本刊更多论文

HANOLISTIC:一种采用整体方法的分层自动图像标注系统

图像自动标注是根据数字图像的内容信息为其分配关键词的过程。从某种意义上说，它是从视觉内容信息到语义上下文信息的映射。在这项研究中，我们提出了一种新的自动图像标注方法，其中标注被表述为从一组独立的描述符空间(代表整个图像)到一组代表类标签的词的多元映射。为此，定义了一个分层注释体系结构，称为HANOLISTIC(使用整体方法的分层图像注释系统)，分为两层。第一层称为0级，由注释器组成，每个注释器由一组从整个图像中提取的不同描述符提供。这使我们能够通过描述符的不同视觉属性来表示每个注释器上的图像。由于我们使用了整个图像，因此避免了有问题的分割过程。每个注释器的训练是通过监督学习范式完成的，其中每个单词被认为是一个类标签。请注意，这种方法与经典的训练方法略有不同，其中每个数据都有一个唯一的标签。在提出的系统中，由于每张图像都有一个或多个注释词，我们假设一张图像属于多个类。0级注释器的输出指示词汇表中属于图像的单词的隶属度值。然后，在第二层聚合来自每个注释器的这些成员值，以获得元级注释器。最后，根据元级输出的排序，从词汇表中选择一组单词。本研究提出的分层注释系统优于基于分段和整体方法的最先进的注释系统。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

自引率

0.00%

发文量