一种用于亚型识别的贝叶斯半参数因子分析模型。

Pub Date : 2017-04-25 DOI:10.1515/sagmb-2016-0051
Jiehuan Sun, Joshua L Warren, Hongyu Zhao
{"title":"一种用于亚型识别的贝叶斯半参数因子分析模型。","authors":"Jiehuan Sun,&nbsp;Joshua L Warren,&nbsp;Hongyu Zhao","doi":"10.1515/sagmb-2016-0051","DOIUrl":null,"url":null,"abstract":"<p><p>Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite many successes, existing clustering methods may not perform well when genes are highly correlated and many uninformative genes are included for clustering due to the high dimensionality. In this article, we introduce a novel subtype identification method in the Bayesian setting based on gene expression profiles. This method, called BCSub, adopts an innovative semiparametric Bayesian factor analysis model to reduce the dimension of the data to a few factor scores for clustering. Specifically, the factor scores are assumed to follow the Dirichlet process mixture model in order to induce clustering. Through extensive simulation studies, we show that BCSub has improved performance over commonly used clustering methods. When applied to two gene expression datasets, our model is able to identify subtypes that are clinically more relevant than those identified from the existing methods.</p>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/sagmb-2016-0051","citationCount":"0","resultStr":"{\"title\":\"A Bayesian semiparametric factor analysis model for subtype identification.\",\"authors\":\"Jiehuan Sun,&nbsp;Joshua L Warren,&nbsp;Hongyu Zhao\",\"doi\":\"10.1515/sagmb-2016-0051\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite many successes, existing clustering methods may not perform well when genes are highly correlated and many uninformative genes are included for clustering due to the high dimensionality. In this article, we introduce a novel subtype identification method in the Bayesian setting based on gene expression profiles. This method, called BCSub, adopts an innovative semiparametric Bayesian factor analysis model to reduce the dimension of the data to a few factor scores for clustering. Specifically, the factor scores are assumed to follow the Dirichlet process mixture model in order to induce clustering. Through extensive simulation studies, we show that BCSub has improved performance over commonly used clustering methods. When applied to two gene expression datasets, our model is able to identify subtypes that are clinically more relevant than those identified from the existing methods.</p>\",\"PeriodicalId\":0,\"journal\":{\"name\":\"\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0,\"publicationDate\":\"2017-04-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1515/sagmb-2016-0051\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1515/sagmb-2016-0051\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1515/sagmb-2016-0051","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

疾病亚型识别(聚类)是生物医学研究中的一个重要问题。基因表达谱通常用于推断疾病亚型,这通常会导致对疾病有生物学意义的见解。尽管已有的聚类方法取得了许多成功,但当基因高度相关且由于高维数而包含许多无信息的基因时,现有的聚类方法可能表现不佳。在本文中,我们介绍了一种新的基于基因表达谱的贝叶斯亚型鉴定方法。这种方法被称为BCSub,它采用一种创新的半参数贝叶斯因子分析模型,将数据的维数降至几个因子得分进行聚类。具体来说,为了诱导聚类,假设因子得分遵循Dirichlet过程混合模型。通过大量的仿真研究,我们表明BCSub比常用的聚类方法具有更高的性能。当应用于两个基因表达数据集时,我们的模型能够识别出比现有方法更具有临床相关性的亚型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

A Bayesian semiparametric factor analysis model for subtype identification.

A Bayesian semiparametric factor analysis model for subtype identification.

分享
查看原文
A Bayesian semiparametric factor analysis model for subtype identification.

Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to infer disease subtypes, which often lead to biologically meaningful insights into disease. Despite many successes, existing clustering methods may not perform well when genes are highly correlated and many uninformative genes are included for clustering due to the high dimensionality. In this article, we introduce a novel subtype identification method in the Bayesian setting based on gene expression profiles. This method, called BCSub, adopts an innovative semiparametric Bayesian factor analysis model to reduce the dimension of the data to a few factor scores for clustering. Specifically, the factor scores are assumed to follow the Dirichlet process mixture model in order to induce clustering. Through extensive simulation studies, we show that BCSub has improved performance over commonly used clustering methods. When applied to two gene expression datasets, our model is able to identify subtypes that are clinically more relevant than those identified from the existing methods.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信