Roni Mittelman, Honglak Lee, B. Kuipers, S. Savarese
Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines
2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 476-483. Published 2013-06-23. DOI: 10.1109/CVPR.2013.68 (https://doi.org/10.1109/CVPR.2013.68)
Citations: 32
Abstract
Semantic attributes have become increasingly popular in computer vision in recent years. Attributes provide an intermediate feature representation between low-level features and class categories, and they offer several attractive properties, including improved learning of novel categories from few examples and support for zero-shot learning. However, a major drawback is that learning semantic attributes is laborious, requiring significant time and human intervention to provide labels. To address this issue, we propose a weakly supervised approach to learning mid-level features, in which the only supervision is provided by the category labels of the training examples. We develop a novel extension of the restricted Boltzmann machine (RBM) with Beta-Bernoulli process priors. Unlike the standard RBM, our model uses the class labels to promote more efficient sharing of information across categories, which tends to improve generalization performance. Using semantic attributes for which annotations are available, we show that we can find correspondences between the learned mid-level features and the labeled attributes. The mid-level features therefore have a distinct semantic characterization that closely resembles that of the semantic attributes, even though the attribute labels were not used during training. Our experimental results on object recognition tasks show significant performance gains, outperforming methods that rely on manually labeled semantic attributes.
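The abstract does not give the model's equations, so the following is only a rough sketch of the general idea, not the authors' formulation. It assumes a finite Beta-Bernoulli prior in which each hidden unit has a shared usage probability and each class draws a binary on/off mask over the hidden units, and it assumes the mask simply gates the standard RBM hidden activation p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij). The function names, the hyperparameters `a` and `b`, and the gating rule are all hypothetical.

```python
import math
import random


def sample_class_masks(num_classes, num_hidden, a=2.0, b=1.0, seed=0):
    """Draw one binary mask over hidden units per class (illustrative prior)."""
    rng = random.Random(seed)
    k = num_hidden  # must be > 1 so both Beta parameters are positive
    # Shared usage probability per hidden unit; a and b are hypothetical.
    usage = [rng.betavariate(a / k, b * (k - 1) / k) for _ in range(k)]
    # Each class switches unit j on with probability usage[j], so units with
    # a high usage probability tend to be shared by many classes.
    return [[1 if rng.random() < usage[j] else 0 for j in range(k)]
            for _ in range(num_classes)]


def hidden_probs(visible, weights, hidden_bias, mask):
    """Standard RBM hidden activation, gated by a class mask (assumed rule)."""
    probs = []
    for j, bias in enumerate(hidden_bias):
        activation = bias + sum(v * weights[i][j] for i, v in enumerate(visible))
        probs.append(mask[j] / (1.0 + math.exp(-activation)))
    return probs


if __name__ == "__main__":
    masks = sample_class_masks(num_classes=3, num_hidden=4)
    weights = [[0.5, -0.2, 0.1, 0.0], [0.3, 0.8, -0.5, 0.2]]
    print(masks)
    print(hidden_probs([1, 0], weights, [0.0] * 4, masks[0]))
```

Sharing the usage probabilities across classes is one simple way to let categories reuse the same hidden units; the paper's actual prior and training procedure may differ.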