Automated Labeling Process for Unknown Images in an Open-World Scenario

Impact factor: 0.5 | JCR quartile: Q4 | Category: ENGINEERING, CHEMICAL

Hungarian Journal of Industry and Chemistry | Pub Date: 2019-06-27 | DOI: 10.33927/HJIC-2019-06

Dávid Papp, G. Szűcs


Citations: 0

Abstract



Most recognition systems presume a controlled, well-defined research setting in which all classes that can appear at test time are known a priori. This setting is referred to as the "closed-world" model, whereas the "open-world" model implies that unknown classes can be incorporated into a recognition algorithm at prediction time. Recognition systems that operate in the real world therefore have to deal with these unknown categories. Our objective was not only to detect data that originate from categories unseen during training, but also to identify similarities between pieces of unknown data and then form new classes by automatically labeling them. Our Double Probability Model was extended by an image clustering algorithm based on Kernel K-means. A new procedure, the Cluster Classification algorithm, is proposed for the detection of unknowns and automated labeling. These approaches facilitate the transition from open-set recognition to the open-world problem. The Fisher Vector (FV) was used as the mathematical representation of the images, and a Support Vector Machine was used as the classifier. Similarity was measured on the FV representations. Experiments were conducted on the Caltech101 and Caltech256 image datasets, and the Rand Index was evaluated over the unknown data. The results showed that the proposed Cluster Classification algorithm yielded almost the same Rand Index even as the number of unknown categories increased.
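The abstract names two building blocks that are easy to illustrate in isolation: Kernel K-means for grouping unknown samples and the Rand Index for scoring the resulting grouping. The following is a minimal sketch of both, not the authors' implementation. The RBF kernel, the `gamma` value, the fixed initial labels, and the toy 2-D points (standing in for Fisher Vector representations) are all assumptions made for illustration; the Double Probability Model, the Cluster Classification algorithm, and the SVM stage are not reproduced here.

```python
from itertools import combinations

import numpy as np


def rbf_kernel(X, gamma=0.1):
    """RBF kernel matrix (an assumed kernel choice for this sketch)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_kmeans(K, k, init_labels, max_iter=100):
    """Kernel K-means on a precomputed kernel matrix K.

    The feature-space distance from point i to the centroid of cluster c is
    K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl, with j, l ranging over c.
    """
    labels = np.asarray(init_labels).copy()
    n = K.shape[0]
    for _ in range(max_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue  # an empty cluster stays unreachable
            within = K[np.ix_(members, members)].sum() / m ** 2
            cross = K[:, members].sum(axis=1) / m
            dist[:, c] = np.diag(K) - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels


def rand_index(a, b):
    """Fraction of sample pairs on which two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Toy stand-in for image descriptors: two well-separated groups of 2-D points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
truth = [0, 0, 0, 1, 1, 1]

# Deliberately mixed starting assignment (hypothetical initialization).
labels = kernel_kmeans(rbf_kernel(X), k=2, init_labels=[0, 1, 0, 1, 0, 1])
score = rand_index(truth, labels.tolist())
print(labels.tolist(), score)
```

Because the Rand Index compares pairs of samples rather than label values, it does not matter which integer each discovered cluster receives. That property suits automated labeling of unknown categories, where cluster identifiers are arbitrary.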


Source journal: Hungarian Journal of Industry and Chemistry (ENGINEERING, CHEMICAL)

Self-citation rate: 50.00%

Articles published: (not listed)

Review time: 6 weeks