Minimum Bayesian error probability-based gene subset selection.

IF 0.4 4区生物学 Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

International Journal of Data Mining and Bioinformatics Pub Date : 2015-01-01 DOI:10.1504/ijdmb.2015.070056

Jian Li, Tian Yu, Jin-Mao Wei

{"title":"Minimum Bayesian error probability-based gene subset selection.","authors":"Jian Li, Tian Yu, Jin-Mao Wei","doi":"10.1504/ijdmb.2015.070056","DOIUrl":null,"url":null,"abstract":"<p><p>Sifting functional genes is crucial to the new strategies for drug discovery and prospective patient-tailored therapy. Generally, simply generating gene subset by selecting the top k individually superior genes may obtain an inferior gene combination, for some selected genes may be redundant with respect to some others. In this paper, we propose to select gene subset based on the criterion of minimum Bayesian error probability. The method dynamically evaluates all available genes and sifts only one gene at a time. A gene is selected if its combination with the other selected genes can gain better classification information. Within the generated gene subset, each individual gene is the most discriminative one in comparison with those that classify cancers in the same way as this gene does and different genes are more discriminative in combination than in individual. The genes selected in this way are likely to be functional ones from the system biology perspective, for genes tend to co-regulate rather than regulate individually. Experimental results show that the classifiers induced based on this method are capable of classifying cancers with high accuracy, while only a small number of genes are involved.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":"12 4","pages":"434-50"},"PeriodicalIF":0.4000,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.070056","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Data Mining and Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1504/ijdmb.2015.070056","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}

引用次数: 0

Abstract

Sifting functional genes is crucial to the new strategies for drug discovery and prospective patient-tailored therapy. Generally, simply generating gene subset by selecting the top k individually superior genes may obtain an inferior gene combination, for some selected genes may be redundant with respect to some others. In this paper, we propose to select gene subset based on the criterion of minimum Bayesian error probability. The method dynamically evaluates all available genes and sifts only one gene at a time. A gene is selected if its combination with the other selected genes can gain better classification information. Within the generated gene subset, each individual gene is the most discriminative one in comparison with those that classify cancers in the same way as this gene does and different genes are more discriminative in combination than in individual. The genes selected in this way are likely to be functional ones from the system biology perspective, for genes tend to co-regulate rather than regulate individually. Experimental results show that the classifiers induced based on this method are capable of classifying cancers with high accuracy, while only a small number of genes are involved.

查看原文本刊更多论文

基于最小贝叶斯误差概率的基因子集选择。

筛选功能基因对于药物发现和前瞻性患者定制治疗的新策略至关重要。一般来说，简单地通过选择前k个单独的优越基因来产生基因子集可能会得到一个较差的基因组合，因为一些被选择的基因可能相对于其他基因是冗余的。在本文中，我们提出了基于最小贝叶斯错误概率准则的基因子集选择。该方法动态评估所有可用的基因，一次只筛选一个基因。如果一个基因与其他被选择基因的组合能获得更好的分类信息，那么这个基因就被选择了。在生成的基因子集中，与那些以相同方式分类癌症的基因相比，每个单独的基因是最具区别性的，不同的基因组合起来比单个基因更具区别性。从系统生物学的角度来看，以这种方式选择的基因很可能是功能性基因，因为基因倾向于共同调控而不是单独调控。实验结果表明，基于该方法诱导的分类器能够以较高的准确率对癌症进行分类，并且只涉及少量基因。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

International Journal of Data Mining and Bioinformatics 生物-数学与计算生物学

CiteScore

1.00

自引率

0.00%

发文量

审稿时长

>12 weeks

期刊介绍： Mining bioinformatics data is an emerging area at the intersection between bioinformatics and data mining. The objective of IJDMB is to facilitate collaboration between data mining researchers and bioinformaticians by presenting cutting edge research topics and methodologies in the area of data mining for bioinformatics. This perspective acknowledges the inter-disciplinary nature of research in data mining and bioinformatics and provides a unified forum for researchers/practitioners/students/policy makers to share the latest research and developments in this fast growing multi-disciplinary research area.