David J. Miller, Chu-Fang Lin, G. Kesidis, Christopher M. Collins
Title: Improved Fine-Grained Component-Conditional Class Labeling with Active Learning
Published in: 2010 Ninth International Conference on Machine Learning and Applications (ICMLA)
Publication date: 2010-12-12
DOI: 10.1109/ICMLA.2010.8 (https://doi.org/10.1109/ICMLA.2010.8)
Abstract
We have recently introduced generative semi-supervised mixture models with finer-grained class-label generation mechanisms than previous methods. Our models combine the advantages of semi-supervised mixtures, which extrapolate labels across a component, with those of nearest-neighbor (NN)/nearest-prototype (NP) classification, which classifies accurately in the vicinity of labeled samples. They are advantageous when within-component class proportions are not constant over the feature-space region "owned by" a component. In this paper, we develop an active-learning extension of our fine-grained labeling methods, proposing two new uncertainty sampling methods and comparing them with traditional entropy-based uncertainty sampling. Experiments on a number of UC Irvine data sets show that the proposed active-learning methods improve classification accuracy more than standard entropy-based active learning, and that they are particularly advantageous when the labeled percentage is small. We also extend our semi-supervised method to allow variable weighting of the labeled- and unlabeled-data likelihood terms; this approach is shown to outperform previous weighting schemes.
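The entropy-based uncertainty sampling that the paper uses as its baseline can be sketched as follows. This is a minimal illustration of the standard technique (not the paper's proposed samplers): given posterior class probabilities from the current model, query the unlabeled samples whose predictive distribution has the highest Shannon entropy. The function name and the toy posteriors are illustrative.

```python
import numpy as np

def entropy_uncertainty_sampling(class_probs, k):
    """Return indices of the k unlabeled samples with highest predictive entropy.

    class_probs: (n_samples, n_classes) array of posteriors P(c | x)
    from the current classifier. Higher entropy = more uncertain.
    """
    eps = 1e-12  # guard against log(0) for confident predictions
    entropy = -np.sum(class_probs * np.log(class_probs + eps), axis=1)
    # Sort by entropy, descending, and take the top k.
    return np.argsort(entropy)[::-1][:k]

# Toy example: three samples; the middle one has a near-uniform posterior,
# so it is the most uncertain and gets queried first.
probs = np.array([[0.9, 0.1],
                  [0.5, 0.5],
                  [0.8, 0.2]])
print(entropy_uncertainty_sampling(probs, 1))  # -> [1]
```

In an active-learning loop, the selected samples would be sent to an oracle for labeling, added to the labeled pool, and the model refit before the next query round.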