{"title":"Discriminative learning of relaxed hierarchy for large-scale visual recognition","authors":"Tianshi Gao, D. Koller","doi":"10.1109/ICCV.2011.6126481","DOIUrl":null,"url":null,"abstract":"In the real visual world, the number of categories a classifier needs to discriminate is on the order of hundreds or thousands. For example, the SUN dataset [24] contains 899 scene categories and ImageNet [6] has 15,589 synsets. Designing a multiclass classifier that is both accurate and fast at test time is an extremely important problem in both machine learning and computer vision communities. To achieve a good trade-off between accuracy and speed, we adopt the relaxed hierarchy structure from [15], where a set of binary classifiers are organized in a tree or DAG (directed acyclic graph) structure. At each node, classes are colored into positive and negative groups which are separated by a binary classifier while a subset of confusing classes is ignored. We color the classes and learn the induced binary classifier simultaneously using a unified and principled max-margin optimization. We provide an analysis on generalization error to justify our design. Our method has been tested on both Caltech-256 (object recognition) [9] and the SUN dataset (scene classification) [24], and shows significant improvement over existing methods.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"181","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2011.6126481","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 181
Abstract
In the real visual world, the number of categories a classifier needs to discriminate is on the order of hundreds or thousands. For example, the SUN dataset [24] contains 899 scene categories and ImageNet [6] has 15,589 synsets. Designing a multiclass classifier that is both accurate and fast at test time is an extremely important problem in both machine learning and computer vision communities. To achieve a good trade-off between accuracy and speed, we adopt the relaxed hierarchy structure from [15], where a set of binary classifiers are organized in a tree or DAG (directed acyclic graph) structure. At each node, classes are colored into positive and negative groups which are separated by a binary classifier while a subset of confusing classes is ignored. We color the classes and learn the induced binary classifier simultaneously using a unified and principled max-margin optimization. We provide an analysis on generalization error to justify our design. Our method has been tested on both Caltech-256 (object recognition) [9] and the SUN dataset (scene classification) [24], and shows significant improvement over existing methods.