Causal Genomic and Epigenomic Network Analysis Emerges as a New Generation of Genetic Studies of Complex Diseases

M. Xiong
Journal of Phylogenetics & Evolutionary Biology, 3(1). Published 2013-05-01. DOI: 10.4172/2329-9002.1000e113 (https://doi.org/10.4172/2329-9002.1000e113)
Citations: 0

Abstract

In the past decade, rapid advances in genomic technologies have dramatically changed genetic studies of complex diseases. Genome-wide association studies (GWAS) have been widely used to dissect the genetic structure of complex diseases. As of December 18, 2014, the Catalog of Published Genome-Wide Association Studies had reported significant associations of 15,177 SNPs with more than 700 traits in 2,087 publications [1]. However, numerous studies have reported that the genetic loci identified by GWAS collectively explain less than 10% of the genetic variation across the population in most complex diseases. About 90% of the heritability of common diseases remains unexplained despite the large number of GWAS loci identified. Each variant usually has a weak effect and makes only a small contribution to the disease. More than 1,000 loci for many complex diseases have been identified [2]. Although extremely large samples are being collected and whole-genome sequencing studies will soon be conducted, which should reduce the fraction of missing heritability, a large proportion of heritability will still be missing under the paradigm of single-trait genetic analysis. Both the methods for heritability estimation and the single-trait study paradigm are questionable.

A biological system consists of multiple phenotypes, and these phenotypes are correlated. It has been reported that more than 4.6% of the SNPs and 16.9% of the genes in previous GWAS were significantly associated with more than one trait [3]. These results indicate that genetic pleiotropic effects likely play a crucial role in the molecular basis of correlated phenotypes [4]. The heritability of an individual phenotype cannot reveal the complicated genotype-phenotype structure and is highly unlikely to fully capture the structure of heritability of multiple phenotypes. Furthermore, the estimation of heritability by a single-trait approach might be inaccurate.
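The concept of heritability should be extended from a single trait to multiple traits.

Consider $k$ traits. The breeding values and phenotype values of the $k$ traits are denoted by the $k$-dimensional vectors $A = [A_1, \ldots, A_k]^T$ and $P = [P_1, \ldots, P_k]^T$, respectively. A breeding equation is given by

$$A = HP, \tag{1}$$

where $H$ is a heritability matrix, denoted by

$$H = \begin{bmatrix} h_{11} & \cdots & h_{1k} \\ \vdots & \ddots & \vdots \\ h_{k1} & \cdots & h_{kk} \end{bmatrix}.$$

Suppose that the phenotype can be decomposed as a sum of additive, dominance and environmental effects:

$$P = A + D + E, \tag{2}$$

where $A$, $D$ and $E$ represent the genetic additive, dominance and environmental effects, respectively. Denote the covariance matrix between the breeding values and the phenotype values by

$$\operatorname{cov}(A, P) = \begin{bmatrix} \operatorname{cov}(A_1, P_1) & \cdots & \operatorname{cov}(A_1, P_k) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(A_k, P_1) & \cdots & \operatorname{cov}(A_k, P_k) \end{bmatrix}$$

and the variance-covariance matrix of the phenotype $P$ by

$$\operatorname{var}(P) = \begin{bmatrix} \operatorname{var}(P_1) & \cdots & \operatorname{cov}(P_1, P_k) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(P_k, P_1) & \cdots & \operatorname{var}(P_k) \end{bmatrix}.$$

It follows from equation (2) that

$$\operatorname{cov}(A_i, P_j) = \operatorname{cov}(A_i, A_j) + \operatorname{cov}(A_i, D_j) + \operatorname{cov}(A_i, E_j),$$

which implies that

$$\operatorname{cov}(A, P) = \begin{bmatrix} \operatorname{cov}(A_1, A_1) + \operatorname{cov}(A_1, D_1) + \operatorname{cov}(A_1, E_1) & \cdots & \operatorname{cov}(A_1, A_k) + \operatorname{cov}(A_1, D_k) + \operatorname{cov}(A_1, E_k) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(A_k, A_1) + \operatorname{cov}(A_k, D_1) + \operatorname{cov}(A_k, E_1) & \cdots & \operatorname{cov}(A_k, A_k) + \operatorname{cov}(A_k, D_k) + \operatorname{cov}(A_k, E_k) \end{bmatrix}.$$

Taking the covariance of both sides of equation (1) with $P$ gives $\operatorname{cov}(A, P) = H \operatorname{var}(P)$, so the heritability matrix is estimated by

$$H = \operatorname{cov}(A, P)\,[\operatorname{var}(P)]^{-1}. \tag{3}$$

Equation (3) shows that the heritability of the $i$-th trait, $h_{ii}$, is a function of the genetic covariance between the $i$-th trait and the other traits. In other words, the heritability of each trait is influenced by its correlation with the other traits. This shows that a trait-by-trait genetic study overlooks the influence of other traits. The missing heritability may be due to trait-by-trait genetic analysis, and joint genetic analysis of multiple traits may increase the heritability.

There is an increasing consensus that analyses of individual genetic and epigenetic variants, individual genes, individual linear pathways and individual traits cannot capture the intrinsic genetic and epigenetic complexity of multiple phenotypes.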
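As a rough numerical illustration of equation (3), the sketch below computes a heritability matrix for two traits from assumed covariance matrices. It is not taken from the editorial: the function names `heritability_matrix` and `single_trait_heritability`, the example values in `G` and `R`, and the simplifying assumptions (no dominance effect, and breeding values uncorrelated with environmental effects, so that cov(A,P) equals the additive genetic covariance matrix) are all illustrative choices.

```python
import numpy as np


def heritability_matrix(cov_AP, var_P):
    """Return H = cov(A, P) [var(P)]^{-1}, as in equation (3).

    Solves H @ var_P = cov_AP instead of forming the inverse explicitly.
    """
    cov_AP = np.asarray(cov_AP, dtype=float)
    var_P = np.asarray(var_P, dtype=float)
    # (H var_P)^T = var_P^T H^T, so solve for H^T and transpose back.
    return np.linalg.solve(var_P.T, cov_AP.T).T


def single_trait_heritability(var_A, var_P):
    """Classical trait-by-trait ratio var(A_i) / var(P_i) for each trait."""
    return np.diag(var_A) / np.diag(var_P)


if __name__ == "__main__":
    # Hypothetical additive genetic covariance matrix of two traits.
    G = np.array([[0.5, 0.2],
                  [0.2, 0.4]])
    # Hypothetical environmental covariance matrix (dominance set to zero).
    R = np.array([[0.5, 0.1],
                  [0.1, 0.6]])
    # Assumption: A is uncorrelated with E, so cov(A, P) = G and var(P) = G + R.
    H = heritability_matrix(G, G + R)
    print("Heritability matrix H:")
    print(H)
    print("Single-trait ratios:", single_trait_heritability(G, G + R))
```

Under these assumptions the diagonal entries of `H` differ from the single-trait ratios whenever the traits are correlated, which is the dependence on other traits that equation (3) describes.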
To completely capture the heritability, the right research direction is to jointly investigate genetic, expression, miRNA, epigenetic and metabolic variants, physiological traits, medical imaging measurements and environments across multiple traits, which are often organized into interacting networks. Integrative analysis of genetic, epigenetic, imaging and environmental variation in multiple phenotypes will fully uncover the heritability and facilitate understanding of the mechanisms of complex diseases. The popular methods for integrative analysis are mainly based on correlation and association analysis. These methods cannot efficiently detect, distinguish and characterize true biological, mediated and spurious pleiotropic effects. Therefore, they may not provide clear biologically or clinically relevant information that allows the mechanisms of genetic effects to be discovered and understood. To overcome these limitations, a new framework and novel statistical methods are urgently needed for inferring genotype-phenotype causal networks from next-generation sequencing (NGS) data and for detecting, distinguishing and characterizing the true biological, mediated and spurious pleiotropic effects of genetic variants.

An essential issue in using causal graphs to study the genetics of multiple phenotypes is how to accurately and efficiently estimate the structure of the causal graph from observational data. Structure learning of causal graphs has been shown to be NP-hard. Early methods for structure learning mainly focused on approximation algorithms, but such methods cannot guarantee recovery of the true causal graph. To obtain a causal graph from observational data that is as close to the biological causal graph as possible, “score and search”-based methods for exact learning of genotype-phenotype causal graphs, which find the best-scoring structure for a given dataset, are being developed.
The accurate and robust estimation of genotype-phenotype causal networks by “score and search” methods will shift the paradigm of genetic studies of correlated multiple phenotypes from association analysis to causal inference, and will dramatically facilitate discovery of the mechanisms underlying multiple traits.

Although their application to genome-wide genotype-phenotype network construction is difficult because of computational limitations, “score and search”-based causal inference methods are well suited to phenome-wide association studies, which start from phenomics, defined as the unbiased study of a large number of phenotypes in a population. In such studies, we examine the complex networks between multiple expressed phenotypes and genetic variants. Because the number of genetic variants in a phenome-wide association study is quite limited, the size of the genotype-phenotype network is also limited, and the computational time required to construct genotype-phenotype networks using causal inference is within the reach of current computer systems. Advances in biosensors and sequencing technologies generate large amounts of phenotype and genetic data. Causal genetic and epigenetic network analysis may emerge as a new paradigm for genetic studies of complex traits. The main purpose of this editorial is to stimulate discussion about the optimal strategies for developing a new generation of genetic analysis. I hope that more and more real data analyses in the future will greatly increase confidence in causal inference for genotype-phenotype studies.
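The “score and search” idea discussed above can be sketched on a toy genotype-phenotype network. This is only an illustration under stated assumptions, not the method the editorial refers to: it scores every directed acyclic graph (DAG) on three variables with a linear-Gaussian BIC score, which is an assumed choice of score, and searches by brute-force enumeration, which is exact but feasible only for a handful of nodes because the number of DAGs grows super-exponentially. The helper names (`gaussian_bic`, `is_acyclic`, `best_dag`, `simulate_chain`) and the simulated data are hypothetical. Edges pointing into the genotype are forbidden, reflecting the assumption that phenotypes do not cause germline genotypes.

```python
import itertools
import numpy as np


def gaussian_bic(data, child, parents):
    """BIC term for one node: Gaussian log-likelihood of a linear regression
    of `child` on `parents`, minus a penalty on the number of parameters."""
    n = data.shape[0]
    y = data[:, child]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    n_params = X.shape[1] + 1  # regression coefficients + residual variance
    return loglik - 0.5 * n_params * np.log(n)


def is_acyclic(n_nodes, edges):
    """Repeatedly strip nodes with no incoming edge; a cycle leaves none to strip."""
    remaining = set(range(n_nodes))
    while remaining:
        sources = [v for v in remaining
                   if not any(c == v and p in remaining for p, c in edges)]
        if not sources:
            return False
        remaining -= set(sources)
    return True


def best_dag(data, forbidden_children=()):
    """Exhaustively score all DAGs and return (edges, score) of the best one."""
    n_nodes = data.shape[1]
    pairs = list(itertools.combinations(range(n_nodes), 2))
    best_score, best_edges = -np.inf, None
    # For each pair (i, j): 0 = no edge, 1 = i -> j, 2 = j -> i.
    for states in itertools.product((0, 1, 2), repeat=len(pairs)):
        edges = []
        for (i, j), s in zip(pairs, states):
            if s == 1:
                edges.append((i, j))
            elif s == 2:
                edges.append((j, i))
        if any(child in forbidden_children for _, child in edges):
            continue
        if not is_acyclic(n_nodes, edges):
            continue
        score = sum(gaussian_bic(data, v, [p for p, c in edges if c == v])
                    for v in range(n_nodes))
        if score > best_score:
            best_score, best_edges = score, sorted(edges)
    return best_edges, best_score


def simulate_chain(n=1000, seed=0):
    """Simulated data for a chain: genotype (0) -> trait 1 (1) -> trait 2 (2)."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # additive SNP coding 0/1/2
    p1 = 1.0 * g + rng.normal(size=n)
    p2 = 1.0 * p1 + rng.normal(size=n)
    return np.column_stack([g, p1, p2])


if __name__ == "__main__":
    edges, score = best_dag(simulate_chain(), forbidden_children={0})
    print("Best-scoring DAG edges:", edges, "BIC score:", score)
```

Forbidding edges into the genotype node is what lets the search orient edges here: without that constraint, several DAGs in the same Markov equivalence class would receive identical BIC scores.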