N. Settouti, Mostafa El Habib Daho, Mohammed El Amine Lazouni, M. A. Chikh
Random forest in semi-supervised learning (Co-Forest)
2013 8th International Workshop on Systems, Signal Processing and their Applications (WoSSPA), published 2013-05-12
DOI: 10.1109/WOSSPA.2013.6602385 (https://doi.org/10.1109/WOSSPA.2013.6602385)
Citations: 12
Abstract
Semi-supervised learning has been widely applied in fields such as medical diagnosis and pattern recognition. Semi-supervised methods exploit unlabelled data in addition to labelled data to improve the classification of large data sets for which only a small number of labelled examples are available. Ensemble methods are considered an effective solution to the problem of dimensionality and can improve on the robustness and generalization ability of individual learners. In this paper, we focus on Co-Forest, a semi-supervised ensemble algorithm based on Random Forest, for the classification of large biological data sets. The algorithm is evaluated on its ability to correctly predict the labels of unlabelled examples and on its robustness as the number of available labelled examples decreases.