An active learning method under very limited initial labeled data
Yue Zhao, Q. Ji
2010 IEEE International Conference on Automation and Logistics
Published: 2010-09-23
DOI: 10.1109/ICAL.2010.5585339 (https://doi.org/10.1109/ICAL.2010.5585339)
Citations: 0
Abstract
Active learning methods seek to reduce the number of labeled instances needed to train an effective classifier. Most current methods assume the availability of a reasonable amount of initially labeled training data so that the learners can be trained with sufficient quality. However, for many applications the amount of initial training data is limited, which degrades the quality of the initial learners and, in turn, the performance of the active learning methods. In this paper, we introduce a new non-parametric active learning strategy that performs well even with very limited initial training data. Our method selects the query instance that simultaneously maximizes its label uncertainty and the classification accuracy on the unlabeled test data. It therefore avoids selecting outliers and does not require a good initial learner. Experimental results on benchmark datasets show that our method outperforms state-of-the-art methods, especially when the initially labeled data is small in quantity or poor in quality.
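The selection criterion described above can be illustrated with a rough sketch. This is an interpretation of the abstract only, not the authors' algorithm: each pool instance is scored by the product of its label uncertainty (entropy under the current learner) and the expected confidence of the learner on the remaining pool after adding that instance, where the expectation is taken over the instance's possible labels. Outliers receive high uncertainty but tend to lower the learner's confidence on the rest of the pool, so they score poorly. A k-nearest-neighbour classifier stands in for the non-parametric learner; the function names, the choice of k, and the confidence proxy are all assumptions.

```python
import numpy as np

def knn_proba(X_train, y_train, X, classes, k=3):
    """Class-probability estimates from a k-nearest-neighbour vote
    (a stand-in for the paper's non-parametric learner)."""
    P = np.zeros((len(X), len(classes)))
    for i, x in enumerate(X):
        dists = np.linalg.norm(X_train - x, axis=1)
        neighbours = y_train[np.argsort(dists)[:k]]
        for ci, c in enumerate(classes):
            P[i, ci] = np.mean(neighbours == c)
    return P

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def select_query(X_lab, y_lab, X_pool, k=3):
    """Pick the pool index maximizing uncertainty * expected pool confidence.
    Illustrative only -- not the authors' exact formulation."""
    classes = np.unique(y_lab)
    probs = knn_proba(X_lab, y_lab, X_pool, classes, k)
    scores = []
    for i, x in enumerate(X_pool):
        unc = entropy(probs[i])                    # label uncertainty of x
        rest = np.delete(X_pool, i, axis=0)        # rest of the unlabeled pool
        exp_conf = 0.0
        # Expected confidence on the pool after adding x, averaged over
        # its possible labels weighted by the current P(y | x).
        for ci, c in enumerate(classes):
            X_aug = np.vstack([X_lab, x[None]])
            y_aug = np.append(y_lab, c)
            if len(rest):
                P_rest = knn_proba(X_aug, y_aug, rest, classes, k)
                conf = float(P_rest.max(axis=1).mean())
            else:
                conf = 1.0
            exp_conf += probs[i, ci] * conf
        scores.append(unc * exp_conf)   # outliers: high unc, low exp_conf
    return int(np.argmax(scores))
```

With two well-separated labeled clusters, a query between them gets high uncertainty while still keeping the learner confident on the rest of the pool, so it tends to be preferred over points deep inside a cluster.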