{"title":"On Dynamic Selection of the Most Informative Samples in Classification Problems","authors":"E. Lughofer","doi":"10.1109/ICMLA.2010.89","journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications"},"publicationDate":"2010-12-12","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"3","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICMLA.2010.89"}
On Dynamic Selection of the Most Informative Samples in Classification Problems
In this paper, we propose a dynamic technique for selecting the most informative samples in classification problems. It operates in two stages: the first stage conducts sample selection in batch off-line mode, based on unsupervised criteria extracted from cluster partitions; the second stage applies an active learning scheme during on-line adaptation of classifiers in non-stationary environments, based on the reliability of the classifiers in their output responses (the confidences in their predictions). Both approaches reduce the annotation effort for operators, who only have to label or give feedback on a subset of the off-line and on-line samples. At the same time, they keep the accuracy at almost the same level as if the classifiers had been trained on all samples. This is verified on real-world data sets from two image classification problems used in on-line surface inspection scenarios.
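The two-stage idea can be illustrated with a minimal Python sketch. The abstract does not state the concrete selection criteria, so both rules below are assumptions made only for illustration: stage 1 picks the samples closest to each cluster center as labeling candidates, and stage 2 asks the operator for a label only when the classifier's highest class confidence falls below a threshold. The names `select_offline`, `needs_label`, `per_cluster`, and `threshold` are hypothetical and do not come from the paper.

```python
from math import dist


def select_offline(samples, centers, per_cluster=1):
    """Stage 1 (assumed criterion): return indices of the `per_cluster`
    samples closest to each cluster center, as candidates for labeling."""
    # Assign every sample index to its nearest center.
    members = {c: [] for c in range(len(centers))}
    for i, x in enumerate(samples):
        nearest = min(range(len(centers)), key=lambda c: dist(x, centers[c]))
        members[nearest].append(i)

    chosen = []
    for c, idxs in members.items():
        # Within a cluster, prefer the samples nearest to its center.
        idxs.sort(key=lambda i: dist(samples[i], centers[c]))
        chosen.extend(idxs[:per_cluster])
    return sorted(chosen)


def needs_label(class_confidences, threshold=0.7):
    """Stage 2 (assumed rule): request operator feedback only when the
    winning class confidence is below `threshold`."""
    return max(class_confidences) < threshold


# Toy data: five 2-D samples and two precomputed cluster centers.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.2), (0.4, 0.4)]
centers = [(0.0, 0.0), (1.0, 1.0)]
offline_picks = select_offline(samples, centers)

# Toy on-line stream of per-class confidences from some classifier.
stream = [[0.95, 0.05], [0.55, 0.45], [0.2, 0.8], [0.6, 0.4]]
queried = [i for i, conf in enumerate(stream) if needs_label(conf)]
```

In this toy run only two of the five off-line samples and two of the four on-line samples would be sent to the operator. The threshold trades annotation effort against the risk of missing informative samples; its value here is arbitrary.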