Variational Bayes procedure for effective classification of tumor type with microarray gene expression data.

IF 0.4, Zone 4 (Mathematics), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Statistical Applications in Genetics and Molecular Biology Pub Date : 2012-10-30 DOI:10.1515/1544-6115.1700

Takeshi Hayashi

Volume 11, Issue 5, Article 9

Citations: 0

Abstract

Microarrays, which can simultaneously measure the expression levels of thousands of genes, have become a valuable tool for classifying tumors. Because the sample size in such classification is usually much smaller than the number of genes, properly sparse models are essential for predicting tumor types accurately without over-fitting. Bayesian shrinkage estimation is considered well suited to providing such sparse models: it shrinks the estimated effects of the many irrelevant genes toward zero while keeping those of a small number of relevant genes at significant magnitudes. However, Bayesian analysis usually relies on computationally intensive techniques such as Markov chain Monte Carlo (MCMC) iterations. This paper describes a computationally efficient method of Bayesian shrinkage regression (BSR) that incorporates multiple hierarchical structures to construct a classification model for tumor types from microarray gene expression data. We use a variational approximation method, which provides simple approximations of the posterior distributions of the parameters, to reduce the computational burden of Bayesian estimation. The resulting BSR procedure yields a properly sparse model that classifies tumor samples accurately and rapidly. On simulated and actual gene expression data sets, its classification accuracy is shown to be at least equivalent to that of other methods such as support vector machines and partial least squares.
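To make the idea of variational shrinkage concrete, the following is a minimal sketch, not the paper's BSR model. It assumes a simpler stand-in: a linear regression on tumor labels coded as -1/+1, with an independent Gaussian prior on each gene effect whose precision has a Gamma hyperprior (automatic relevance determination), fitted by mean-field variational updates. The function names, the hyperparameter values, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def vb_shrinkage_regression(X, y, n_iter=100, a0=1e-2, b0=1e-4, c0=1e-2, d0=1e-4):
    """Mean-field variational Bayes for y = X w + noise (illustrative stand-in).

    Assumed model: w_j ~ N(0, 1/alpha_j), alpha_j ~ Gamma(a0, b0),
    noise precision tau ~ Gamma(c0, d0). Returns the mean and covariance
    of the Gaussian approximation q(w).
    """
    n, p = X.shape
    XtX = X.T @ X
    Xty = X.T @ y
    e_alpha = np.ones(p)  # E[alpha_j] under q(alpha)
    e_tau = 1.0           # E[tau] under q(tau)
    for _ in range(n_iter):
        # Update q(w) = N(m, S)
        S = np.linalg.inv(e_tau * XtX + np.diag(e_alpha))
        m = e_tau * (S @ Xty)
        # Update q(alpha_j): a large E[alpha_j] shrinks gene j toward zero
        e_alpha = (a0 + 0.5) / (b0 + 0.5 * (m ** 2 + np.diag(S)))
        # Update q(tau); trace(X S X^T) equals sum(XtX * S) for symmetric S
        resid = y - X @ m
        e_tau = (c0 + 0.5 * n) / (d0 + 0.5 * (resid @ resid + np.sum(XtX * S)))
    return m, S


def classify(X, m):
    """Assign class +1 or -1 from the sign of the posterior-mean linear score."""
    return np.where(X @ m >= 0.0, 1.0, -1.0)


def simulate(n=60, p=150, n_relevant=3, seed=0):
    """Simulated expression matrix in which only the first few genes matter."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    w_true = np.zeros(p)
    w_true[:n_relevant] = [2.0, -2.0, 2.0][:n_relevant]
    y = np.where(X @ w_true >= 0.0, 1.0, -1.0)
    return X, y, w_true


if __name__ == "__main__":
    X, y, w_true = simulate()
    m, _ = vb_shrinkage_regression(X, y)
    print("training accuracy:", np.mean(classify(X, m) == y))
    print("largest |effect| genes:", np.argsort(-np.abs(m))[:5])
```

Each pass through the loop replaces an MCMC sweep with closed-form updates of the approximate posteriors, which is where the computational saving of a variational approach comes from. The paper's actual hierarchical structure and likelihood may differ from this sketch.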




Source journal

Statistical Applications in Genetics and Molecular Biology BIOCHEMISTRY & MOLECULAR BIOLOGY-MATHEMATICAL & COMPUTATIONAL BIOLOGY

Self-citation rate

11.10%

Articles published

About the journal: Statistical Applications in Genetics and Molecular Biology seeks to publish significant research on the application of statistical ideas to problems arising from computational biology. The focus of the papers should be on the relevant statistical issues but should contain a succinct description of the relevant biological problem being considered. The range of topics is wide and will include topics such as linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies. Both original research and review articles will be warmly received.