Minimum Bayesian error probability-based gene subset selection
Authors: Jian Li, Tian Yu, Jin-Mao Wei
Journal: International Journal of Data Mining and Bioinformatics, vol. 12, no. 4, pp. 434-450
Publication date: 2015-01-01
Publication type: Journal Article
DOI: 10.1504/ijdmb.2015.070056 (https://doi.org/10.1504/ijdmb.2015.070056)
Impact factor: 0.2; JCR: Q4 (Mathematical & Computational Biology)
Citation count: 0
Abstract
Identifying functional genes is crucial to new strategies for drug discovery and prospective patient-tailored therapy. Simply building a gene subset from the top k individually superior genes may yield an inferior combination, because some selected genes may be redundant with respect to others. In this paper, we propose selecting a gene subset by the criterion of minimum Bayesian error probability. The method dynamically evaluates all available genes and selects only one gene at a time. A gene is selected if combining it with the already selected genes provides better classification information. Within the resulting subset, each gene is the most discriminative among those that classify cancers in the same way as it does, and different genes are more discriminative in combination than individually. Genes selected in this way are likely to be functional from a systems biology perspective, since genes tend to co-regulate rather than regulate individually. Experimental results show that classifiers induced by this method classify cancers with high accuracy while involving only a small number of genes.
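The one-gene-at-a-time selection described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the Bayesian error probability is estimated or when selection stops, so everything below is an assumption made for illustration: expression values are taken to be already discretized, the error is estimated from empirical joint counts, and selection stops when no remaining gene lowers the error. The names `bayes_error` and `select_genes` and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def bayes_error(rows, labels, genes):
    """Empirical Bayes error of predicting the label from the joint values of `genes`.

    Assumption: expression values are discrete, so each distinct value
    combination forms a cell in which the best guess is the majority class.
    """
    cells = defaultdict(Counter)
    for row, label in zip(rows, labels):
        cells[tuple(row[g] for g in genes)][label] += 1
    correct = sum(max(counts.values()) for counts in cells.values())
    return 1.0 - correct / len(labels)


def select_genes(rows, labels, max_genes=None):
    """Greedy forward selection: add the gene that most lowers the joint error."""
    n_genes = len(rows[0])
    selected = []
    best_err = bayes_error(rows, labels, selected)  # error of the majority-class guess
    while max_genes is None or len(selected) < max_genes:
        candidates = [g for g in range(n_genes) if g not in selected]
        if not candidates:
            break
        # Re-evaluate every remaining gene together with the genes chosen so far.
        err, gene = min((bayes_error(rows, labels, selected + [g]), g) for g in candidates)
        if err >= best_err:  # assumed stopping rule: no gene improves the subset
            break
        selected.append(gene)
        best_err = err
    return selected, best_err


# Toy data (hypothetical): gene 1 duplicates gene 0, gene 2 is complementary.
rows = [[1, 1, 0], [1, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 1], [0, 0, 1]]
labels = ["A", "A", "B", "B", "C", "C"]
selected, err = select_genes(rows, labels)
print(selected, err)
```

In this toy data all three genes have the same individual error, so a top-k ranking could pick the redundant pair of genes 0 and 1, which still confuses classes B and C. The greedy sketch skips the duplicate and pairs gene 0 with gene 2, which separates all three classes.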
Journal description:
Mining bioinformatics data is an emerging area at the intersection of bioinformatics and data mining. The objective of IJDMB is to facilitate collaboration between data mining researchers and bioinformaticians by presenting cutting-edge research topics and methodologies in the area of data mining for bioinformatics. This perspective acknowledges the interdisciplinary nature of research in data mining and bioinformatics and provides a unified forum for researchers, practitioners, students, and policy makers to share the latest research and developments in this fast-growing multidisciplinary research area.