Authors: J. Tanha, M. Saberian, M. Someren
Venue: 2013 IEEE 13th International Conference on Data Mining
Published: 2013-12-01
DOI: 10.1109/ICDM.2013.108 (https://doi.org/10.1109/ICDM.2013.108)
Multiclass Semi-Supervised Boosting Using Similarity Learning
In this paper, we consider the multiclass semi-supervised classification problem and propose a boosting algorithm that solves the multiclass problem directly. The approach uses a new multiclass loss function with two terms: the first is the cost of the multiclass margin, and the second is a regularization term on unlabeled data. The regularization term minimizes the inconsistency between the pairwise similarity and the classifier predictions, assigning soft labels weighted by the similarity between unlabeled and labeled examples. We then derive a boosting algorithm, named CD-MSSBoost, from the proposed loss function using coordinate gradient descent. The derived algorithm is further used to learn an optimal similarity function for the given data. Our experiments on a number of UCI datasets show that CD-MSSBoost outperforms state-of-the-art methods for multiclass semi-supervised learning.
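The abstract does not state the loss in closed form, so the following is only a rough sketch of what a two-term objective of this kind could look like. It is not the CD-MSSBoost formulation. The exponential margin cost, the Gaussian similarity, the trade-off weight `lam`, and every function name here are assumptions chosen for illustration.

```python
import math


def rbf_similarity(a, b, sigma=1.0):
    """Gaussian similarity between two feature vectors (an assumed choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def multiclass_margin(scores, k):
    """Score of class k minus the best competing score (needs >= 2 classes)."""
    rival = max(s for c, s in enumerate(scores) if c != k)
    return scores[k] - rival


def soft_labels(x_unlabeled, labeled_x, labeled_y, n_classes, sigma=1.0):
    """Class distribution for one unlabeled point, weighted by its
    similarity to each labeled example."""
    weights = [0.0] * n_classes
    for x, y in zip(labeled_x, labeled_y):
        weights[y] += rbf_similarity(x_unlabeled, x, sigma)
    total = sum(weights)
    if total == 0.0:
        # All similarities underflowed: fall back to a uniform distribution.
        return [1.0 / n_classes] * n_classes
    return [w / total for w in weights]


def semi_supervised_loss(score_fn, labeled_x, labeled_y, unlabeled_x,
                         n_classes, lam=1.0, sigma=1.0):
    """Hypothetical objective: margin cost on labeled data plus a
    similarity-based inconsistency penalty on unlabeled data."""
    # Term 1: exponential cost of the multiclass margin on labeled examples.
    margin_cost = sum(
        math.exp(-multiclass_margin(score_fn(x), y))
        for x, y in zip(labeled_x, labeled_y)
    )
    # Term 2: penalize predictions that disagree with the soft labels.
    inconsistency = 0.0
    for x in unlabeled_x:
        q = soft_labels(x, labeled_x, labeled_y, n_classes, sigma)
        scores = score_fn(x)
        inconsistency += sum(
            q[k] * math.exp(-multiclass_margin(scores, k))
            for k in range(n_classes)
        )
    return margin_cost + lam * inconsistency
```

In this sketch, a scorer whose predictions on unlabeled points agree with the labels of nearby labeled points receives a lower loss than one that disagrees. That is the behavior the regularization term in the abstract is meant to encourage.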