Classification of multi-class microarray datasets using a minimizing class-overlapping based ECOC algorithm
Authors: Haiyue Yu, Kunhong Liu
Published in: Proceedings of the 5th International Conference on Bioinformatics and Computational Biology
Publication date: 2017-01-06
DOI: 10.1145/3035012.3035018 (https://doi.org/10.1145/3035012.3035018)
Citations: 8
Abstract
The classification of multi-class microarray datasets is much more difficult than that of binary datasets, because the former usually consist of unbalanced data with a smaller sample size in each class. Our paper focuses on the multi-class problem and proposes a new method based on a class-overlapping measure, named Minimum Class-Overlapping Error-Correcting Output Codes (MCO-ECOC). In this algorithm, important variables are first selected through different filter methods. Then class overlapping is measured on the training sets: the algorithm searches all class-splitting schemes and selects the one that minimizes the class-overlapping measure. Each column of the coding matrix represents such a splitting scheme. All the coding matrices are then combined, and redundant columns are eliminated to make the final ensemble system compact. Neural networks are used as binary classifiers. The MCO-ECOC algorithm is applied to classify different multi-class microarray datasets, and the outputs of the base learners are fused to produce the final decision based on the Hamming distance. The experimental results show that the performance of MCO-ECOC is significantly higher than that obtained by DECOC and Forest ECOC.
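The last two steps described above, merging coding-matrix columns and fusing base-learner outputs by Hamming distance, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the four-class toy columns, and the redundancy rule (a column counts as redundant if it duplicates or is the sign-flipped complement of a column already kept) are assumptions made for illustration; the abstract does not define them.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Column = Tuple[int, ...]


def drop_redundant_columns(columns: Sequence[Sequence[int]]) -> List[Column]:
    """Keep the first occurrence of each binary split.

    Assumption: a column is redundant if it equals a kept column or its
    complement, since both describe the same two-group split of the classes.
    """
    seen = set()
    kept: List[Column] = []
    for col in columns:
        col_t = tuple(col)
        complement = tuple(-v for v in col_t)
        if col_t in seen or complement in seen:
            continue
        seen.add(col_t)
        kept.append(col_t)
    return kept


def hamming_decode(outputs: Sequence[int],
                   codewords: Dict[Hashable, Column]) -> Hashable:
    """Return the class whose codeword differs from `outputs` in the fewest positions."""
    def distance(code: Column) -> int:
        return sum(1 for o, c in zip(outputs, code) if o != c)
    # On ties, min() returns the first class in dictionary order.
    return min(codewords, key=lambda label: distance(codewords[label]))


# Hypothetical columns pooled from several coding matrices; each column
# assigns +1 or -1 to the classes A, B, C, D (in that order).
labels = ["A", "B", "C", "D"]
pooled = [
    (+1, +1, -1, -1),
    (+1, -1, +1, -1),
    (-1, -1, +1, +1),  # complement of the first column
    (+1, -1, -1, +1),
    (+1, -1, +1, -1),  # duplicate of the second column
]
kept = drop_redundant_columns(pooled)

# Row i of the merged matrix is the codeword of class i.
codewords = {label: tuple(col[i] for col in kept)
             for i, label in enumerate(labels)}

# Binarized outputs of the three remaining base classifiers for one sample.
prediction = hamming_decode((+1, -1, -1), codewords)
```

In this toy setup, each base classifier would be trained on the split given by one kept column, and `hamming_decode` would be called once per test sample with the thresholded classifier outputs.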