Clustering expressed genes on the basis of their association with a quantitative phenotype.

Genetical research, Pub Date: 2005-12-01, DOI: 10.1017/S0016672305007822

Zhenyu Jia, Shizhong Xu

Genetical research, volume 86, issue 3, pages 193-207.

Citations: 38

Abstract



Cluster analyses of gene expression data are usually based on the association between gene expression and the phenotype of a particular disease. Many disease traits have a clearly defined binary phenotype (presence or absence), so genes can be clustered according to the differences in expression levels between the two contrasting phenotypic groups. For example, cluster analysis based on a binary phenotype has been used successfully in tumour research. Some complex diseases, however, have phenotypes that vary continuously, and the method developed for a binary trait is not immediately applicable to a continuous trait. Because understanding the role of gene expression in these complex traits is of fundamental importance, a new statistical method is needed to cluster expressed genes based on their association with a quantitative trait phenotype. We developed a model-based clustering method to classify genes based on their association with a continuous phenotype. We used a linear model to describe the relationship between gene expression and the phenotypic value. The model effects of the linear model (linear regression coefficients) represent the strength of the association. We assumed that the model effects of each gene follow a mixture of several multivariate Gaussian distributions. Parameter estimation and cluster assignment were accomplished via an Expectation-Maximization (EM) algorithm. The method was verified by analysing two simulated datasets, and further demonstrated using real data generated in a microarray experiment for the study of gene expression associated with Alzheimer's disease.
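The idea in the abstract (regress expression on the phenotype, then cluster the regression coefficients with a Gaussian mixture fitted by EM) can be illustrated with a rough two-stage sketch. This is a simplification and not the authors' procedure: the paper estimates everything within a single model-based EM, whereas the sketch below first obtains per-gene least-squares slopes and then runs a plain EM on them. The function names `regression_effects` and `em_gaussian_mixture`, the choice to drop the intercept, the quantile-based initialisation, and the small ridge term added to each covariance are all assumptions made for illustration.

```python
import numpy as np


def regression_effects(expr, pheno):
    """Per-gene least-squares fit of expression on the phenotype.

    expr: (n_genes, n_samples) expression matrix; pheno: (n_samples,) trait values.
    Returns an (n_genes, 1) array of slopes (assumption: the intercept is dropped).
    """
    X = np.column_stack([np.ones_like(pheno), pheno])   # design matrix [1, z]
    coef, *_ = np.linalg.lstsq(X, expr.T, rcond=None)   # shape (2, n_genes)
    return coef[1:].T


def em_gaussian_mixture(B, k, n_iter=300):
    """Plain EM for a k-component multivariate Gaussian mixture on the rows of B."""
    n, d = B.shape
    # Assumed initialisation: rows at evenly spaced quantiles of the last column.
    order = np.argsort(B[:, -1])
    means = B[order[((np.arange(k) + 0.5) * n / k).astype(int)]].copy()
    covs = np.array([np.eye(d) * B.var(axis=0).mean() + 1e-6 * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: log posterior (up to a constant) of each cluster for each gene.
        log_r = np.empty((n, k))
        for j in range(k):
            diff = B - means[j]
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(covs[j]), diff)
            _, logdet = np.linalg.slogdet(covs[j])
            log_r[:, j] = np.log(weights[j]) - 0.5 * (maha + logdet)
        log_r -= log_r.max(axis=1, keepdims=True)        # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ B) / nk[:, None]
        for j in range(k):
            diff = B - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return resp.argmax(axis=1), means, weights
```

Calling `em_gaussian_mixture(regression_effects(expr, pheno), k=3)` returns a hard cluster label per gene together with the fitted cluster means and mixing weights. The number of clusters `k` has to be supplied by the user here; the sketch does not attempt to select it.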

