On Dynamic Selection of the Most Informative Samples in Classification Problems

2010 Ninth International Conference on Machine Learning and Applications, Pub Date: 2010-12-12, DOI: 10.1109/ICMLA.2010.89

E. Lughofer


Citations: 3

Abstract


On Dynamic Selection of the Most Informative Samples in Classification Problems

In this paper, we propose a dynamic, two-stage technique for selecting the most informative samples in classification problems. The first stage performs sample selection in batch off-line mode, based on unsupervised criteria extracted from cluster partitions. The second stage applies an active learning scheme during on-line adaptation of classifiers in non-stationary environments; this scheme is based on the reliability of the classifiers' output responses (the confidence in their predictions). Both stages reduce the annotation effort for operators, who only have to label or give feedback on a subset of the off-line and on-line samples. At the same time, accuracy stays at almost the same level as if the classifiers had been trained on all samples. This is verified on real-world data sets from two image classification problems used in on-line surface inspection scenarios.
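The abstract does not spell out the selection criteria, so the following is only a minimal sketch of the two-stage idea under stated assumptions, not the paper's actual method. For the off-line stage it assumes cluster centers are already available from some clustering algorithm and picks the samples closest to each center plus the samples that lie most ambiguously between two centers. For the on-line stage it assumes the classifier returns one confidence per class and asks the operator for a label only when the top confidence falls below a threshold. The function names `select_offline` and `should_query`, the ambiguity ratio, and the threshold value are all hypothetical.

```python
import numpy as np


def select_offline(X, centers, n_per_cluster=1, n_border=1):
    """Return sorted indices of samples worth labelling in batch mode.

    Assumed criteria (not taken from the paper): the samples closest to
    each cluster center, plus the samples whose two nearest centers are
    almost equally far away. Requires at least two clusters.
    """
    # Euclidean distance of every sample to every center: (n_samples, n_clusters)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    nearest = d.argmin(axis=1)

    chosen = set()
    for k in range(centers.shape[0]):
        members = np.flatnonzero(nearest == k)
        # Members of cluster k ordered by distance to their own center
        ordered = members[np.argsort(d[members, k])]
        chosen.update(ordered[:n_per_cluster].tolist())

    # Ratio of nearest to second-nearest distance; values near 1 mean
    # the sample sits close to a boundary between two clusters
    d_sorted = np.sort(d, axis=1)
    ambiguity = d_sorted[:, 0] / (d_sorted[:, 1] + 1e-12)
    chosen.update(np.argsort(-ambiguity)[:n_border].tolist())
    return sorted(chosen)


def should_query(class_confidences, threshold=0.7):
    """On-line stage: request operator feedback when the classifier's
    highest class confidence is below the (assumed) threshold."""
    return float(np.max(class_confidences)) < threshold


# Five 2-D samples and two assumed cluster centers
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [0.9, 0.0], [0.5, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
offline_indices = select_offline(X, centers)  # -> [0, 2, 4]

# Uncertain prediction triggers a label request, a confident one does not
query_uncertain = should_query([0.55, 0.45])  # -> True
query_confident = should_query([0.95, 0.05])  # -> False
```

In this toy run only three of the five off-line samples would be sent to the operator, and only the low-confidence on-line prediction triggers a feedback request, which is the kind of annotation saving the abstract describes.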
