Automated Labeling Process for Unknown Images in an Open-World Scenario

Dávid Papp, G. Szűcs

Hungarian Journal of Industry and Chemistry (IF 0.5, JCR Q4, Engineering, Chemical). Published 2019-06-27. DOI: https://doi.org/10.33927/HJIC-2019-06

Abstract

Most recognition systems presume a controlled, well-defined research setting in which every class that can appear during testing is known a priori. This setting is referred to as the "closed-world" model, whereas the "open-world" model allows unknown classes to be encountered at prediction time and incorporated into the recognition algorithm. Recognition systems that operate in the real world therefore have to deal with these unknown categories. Our objective was not only to detect data that originate from categories unseen during training, but also to identify similarities between pieces of unknown data and then form new classes by labeling them automatically. Our Double Probability Model was extended by an image clustering algorithm based on Kernel K-means. A new procedure, the Cluster Classification algorithm, is proposed for the detection of unknowns and for automated labeling. These approaches facilitate the transition from open-set recognition to an open-world problem. The Fisher Vector (FV) was used as the mathematical representation of the images, and a Support Vector Machine was used as the classifier. Similarity was measured on the FV representations. Experiments were conducted on the Caltech101 and Caltech256 image datasets, and the Rand Index was evaluated over the unknown data. The results showed that the proposed Cluster Classification algorithm yielded almost the same Rand Index even as the number of unknown categories increased.
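
To make the clustering and evaluation steps named in the abstract more concrete, the following is a minimal sketch of Kernel K-means scored with the Rand Index. It is not the authors' Cluster Classification algorithm or their Double Probability Model. The toy 2-D points stand in for Fisher Vector representations, and the RBF kernel, the `gamma` value, the `init` argument, and all function names are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def rbf_kernel(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_kmeans(K, k, init=None, n_iter=50, seed=0):
    """Plain Kernel K-means on a precomputed Gram matrix K.

    `init` is an optional starting assignment (hypothetical convenience
    argument); otherwise labels are drawn at random.
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = np.asarray(init) if init is not None else rng.integers(k, size=n)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue  # an empty cluster keeps infinite distance
            # ||phi(x) - mu_c||^2 = K_xx - 2 * mean_j K_xj + mean_jl K_jl
            dist[:, c] = (
                np.diag(K)
                - 2.0 * K[:, mask].sum(axis=1) / m
                + K[np.ix_(mask, mask)].sum() / m**2
            )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments no longer change
        labels = new_labels
    return labels


# Toy stand-in for "unknown" samples: two tight, well-separated groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
truth = [0, 0, 0, 1, 1, 1]
labels = kernel_kmeans(rbf_kernel(X), k=2, init=[0, 0, 1, 0, 1, 1])
print("cluster labels:", labels)
print("Rand Index:", rand_index(truth, labels))
```

The Rand Index compares pairs of samples rather than label values, so it does not depend on how the newly formed classes happen to be numbered, which is convenient when labels for unknown categories are created automatically.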