{"title":"Flexible, Robust, Scalable Semi-supervised Learning via Reliability Propagation","authors":"Chen Huang, Liangxu Pan, Qinli Yang, Honglian Wang, Junming Shao","doi":"10.1109/ICDM51629.2021.00030","DOIUrl":null,"url":null,"abstract":"Semi-supervised learning aims to generate a model with a better performance using plenty of unlabeled data. However, most existing methods treat unlabeled data equally without considering whether it is safe or not, which may lead to the degradation of prediction performance. In this paper, towards reliable semi-supervised learning, we propose a data-driven algorithm, called Reliability Propagation (RP), to learn the reliability of each unlabeled instance. The basic idea is to take local label regularity as a prior, and then perform reliability propagation on an adaptive graph. As a result, the most reliable unlabeled instances could be selected to construct a safer classifier. Beyond, the distributed RP algorithm is introduced to scale up to large volumes of data. In contrast to existing approaches, RP exploits the structural information and shed light on the soft instance selection for unlabeled data in a classifier-independent way. 
Experiments on both synthetic and real-world data have demonstrated that RP allows extracting most reliable unlabeled instances and supports a gained prediction performance compared to other algorithms.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM51629.2021.00030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 0
Abstract
Semi-supervised learning aims to build a better-performing model by exploiting plentiful unlabeled data. However, most existing methods treat all unlabeled data equally, without considering whether each instance is safe to use, which may degrade prediction performance. In this paper, towards reliable semi-supervised learning, we propose a data-driven algorithm, called Reliability Propagation (RP), to learn the reliability of each unlabeled instance. The basic idea is to take local label regularity as a prior and then perform reliability propagation on an adaptive graph. As a result, the most reliable unlabeled instances can be selected to construct a safer classifier. In addition, a distributed RP algorithm is introduced to scale to large volumes of data. In contrast to existing approaches, RP exploits structural information and sheds light on soft instance selection for unlabeled data in a classifier-independent way. Experiments on both synthetic and real-world data demonstrate that RP extracts the most reliable unlabeled instances and improves prediction performance compared to other algorithms.
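The abstract does not give the update rules, so the following is only an illustrative sketch of the general idea (a local-label prior followed by propagation over a graph), not the authors' RP algorithm. Everything in it is an assumption made for illustration: the prior is taken to be the majority-class share among the k nearest labeled points, the graph is a fixed k-nearest-neighbor graph rather than an adaptive one, the update is a simple damped neighbor average, and the names `local_label_prior`, `propagate`, `alpha`, and `n_iter` are hypothetical.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)

def local_label_prior(X_lab, y_lab, X_unl, k):
    """Assumed prior: majority-class share among the k nearest labeled points."""
    nearest = np.argsort(pairwise_dist(X_unl, X_lab), axis=1)[:, :k]
    neighbor_labels = y_lab[nearest]                      # shape (n_unl, k)
    classes = np.unique(y_lab)
    counts = (neighbor_labels[:, :, None] == classes).sum(axis=1)
    return counts.max(axis=1) / k

def propagate(X, prior, k, alpha=0.8, n_iter=50):
    """Assumed update: r <- alpha * (mean of neighbors' r) + (1 - alpha) * prior."""
    d = pairwise_dist(X, X)
    np.fill_diagonal(d, np.inf)                           # no self-neighbors
    nearest = np.argsort(d, axis=1)[:, :k]
    P = np.zeros((len(X), len(X)))
    P[np.repeat(np.arange(len(X)), k), nearest.ravel()] = 1.0 / k
    r = prior.copy()
    for _ in range(n_iter):
        r = alpha * (P @ r) + (1 - alpha) * prior
    return r

# Toy data: two well-separated 1-D clusters plus one borderline unlabeled point.
X_lab = np.array([[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]])
y_lab = np.array([0, 0, 0, 1, 1, 1])
X_unl = np.array([[0.1], [0.3], [9.9], [10.3], [5.25]])

k = 3
# Labeled points are treated as fully reliable (prior 1) in this sketch.
prior = np.concatenate([np.ones(len(X_lab)),
                        local_label_prior(X_lab, y_lab, X_unl, k)])
scores = propagate(np.vstack([X_lab, X_unl]), prior, k)[len(X_lab):]

# Soft instance selection: rank unlabeled points from most to least reliable.
ranking = np.argsort(-scores)
```

In this toy setup the borderline point at 5.25 receives the lowest score, since its nearest labeled neighbors disagree, while points inside a cluster keep a high score. Because the scores are computed without reference to any classifier, they could be passed as sample weights or used to filter unlabeled data before training any downstream model, which is one way to read the abstract's "classifier-independent" soft selection.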