Oversampling Negative Class Improves Contact Map Prediction
G. Markowski, Krzysztof Grabczewski, R. Adamczak
International Journal of Pharma Medicine and Biological Sciences
DOI: 10.18178/ijpmbs.5.4.211-216 (https://doi.org/10.18178/ijpmbs.5.4.211-216)
Citations: 2
Abstract
In this paper we present a contact map predictor trained on an unbalanced training set. The training set is built on a feature space typical for this problem: predicted solvent accessibilities and predicted secondary structures. To show that oversampling the negative class improves prediction accuracy, we built two predictors, based on neural networks and decision trees, respectively. We analyzed the influence of the size of the non-contact class in the training set. Significantly better results are obtained when the non-contact class is at least 4 times larger than the contact class, while the optimal oversampling depends on the type of contacts and the learning algorithm used. Our final predictor, PLCT, took part in CASP11, where it took 3rd place in one of the categories. PLCT is available at http://promap.is.umk.pl/.
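The idea of controlling the size of the non-contact class relative to the contact class can be illustrated with a minimal sketch. The abstract does not describe the authors' actual sampling procedure, so everything here is an assumption made for illustration: the function name `build_training_set`, the `ratio` parameter, and the toy residue-pair examples are hypothetical, and random sampling without replacement is just one plausible way to pick the non-contact examples.

```python
import random


def build_training_set(contacts, non_contacts, ratio=4, seed=0):
    """Build a labeled training set with about `ratio` non-contacts per contact.

    Hypothetical helper: `contacts` and `non_contacts` are lists of
    feature records for residue pairs. Label 1 marks a contact and
    label 0 marks a non-contact.
    """
    rng = random.Random(seed)
    # Never request more negatives than are available.
    n_neg = min(len(non_contacts), ratio * len(contacts))
    sampled_neg = rng.sample(non_contacts, n_neg)
    data = [(x, 1) for x in contacts] + [(x, 0) for x in sampled_neg]
    rng.shuffle(data)
    return data


# Toy data standing in for real residue-pair feature vectors.
toy_contacts = [("contact", i) for i in range(10)]
toy_non_contacts = [("non_contact", i) for i in range(100)]

training_set = build_training_set(toy_contacts, toy_non_contacts, ratio=4)
n_pos = sum(label for _, label in training_set)
n_neg = len(training_set) - n_pos
print(n_pos, n_neg)
```

Varying `ratio` (for example 1, 2, 4, 8) and retraining a classifier on each resulting set would be one way to study how the size of the non-contact class affects prediction quality, in the spirit of the analysis the abstract reports.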