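Identifying genes associated with disease outcomes using joint sparse canonical correlation analysis: An application in renal clear cell carcinoma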

IF 1.7 | Zone 4 (Medicine) | JCR Q3 (Genetics & Heredity)
Diptavo Dutta, Ananda Sen, Jaya M. Satagopan
Journal: Genetic Epidemiology, vol. 48, no. 8, pp. 414-432
Publication date: 2024-05-15
Publication type: Journal Article
DOI: 10.1002/gepi.22566
Article page: https://onlinelibrary.wiley.com/doi/10.1002/gepi.22566
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22566
Citations: 0


Abstract

Somatic changes such as copy number aberrations (CNAs) and epigenetic alterations such as methylation have pivotal effects on disease outcomes and prognosis in cancer by regulating gene expression, which drives critical biological processes. To identify potential biomarkers and molecular targets and to understand how they affect disease outcomes, it is important to identify key groups of CNAs, the associated methylation, and the gene expression they affect through a joint integrative analysis. Here, we propose a novel analysis pipeline, joint sparse canonical correlation analysis (jsCCA), an extension of sparse canonical correlation analysis (sCCA), to identify an ensemble of CNAs, methylation sites, and gene (expression) components in the context of disease endpoints, especially tumor characteristics. Our approach detects potentially orthogonal gene components that are highly correlated with sets of methylation sites, which in turn are correlated with sets of CNA sites. It then identifies the genes within these components that are associated with the outcome. Further, we aggregate the effect of each gene expression set on tumor stage by constructing "gene component scores" and test their interaction with traditional risk factors. Analyzing clinical and genomic data on 515 renal clear cell carcinoma (ccRCC) patients from TCGA-KIRC, we found eight gene components to be associated with methylation sites, regulated by groups of proximally located CNA sites. Association analysis with tumor stage at diagnosis identified a novel association of expression of the ASAH1 gene, trans-regulated by methylation of several genes including SIX5 and by CNAs in the 10q25 region including TCF7L2. Further analysis to quantify the overall effect of gene sets on tumor stage revealed that two of the eight gene components have significant interaction with smoking in relation to tumor stage. These gene components represent distinct biological functions including immune function, inflammatory responses, and hypoxia-regulated pathways. Our findings suggest that jsCCA can identify interpretable and important genes, regulatory structures, and clinically consequential pathways. Such methods are warranted for comprehensive analysis of multimodal data, especially in cancer genomics.
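
The abstract does not spell out the jsCCA algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a rank-one sparse CCA fitted by alternating soft-thresholded power iterations (in the spirit of penalized matrix decomposition, but with a fixed threshold fraction instead of an explicit L1 bound), and it assumes one plausible reading of the "joint" step: fit expression against methylation first, then fit the selected methylation sites against CNAs. All names (`sparse_cca_rank1`, `unit_sparse`, `frac_x`, `frac_y`) and the synthetic data are hypothetical.

```python
import numpy as np

def standardize(A):
    """Center each column and scale it to unit variance (assumes no constant columns)."""
    return (A - A.mean(axis=0)) / A.std(axis=0)

def unit_sparse(a, frac):
    """Soft-threshold `a` at frac * max|a|, then rescale to unit L2 norm."""
    s = np.sign(a) * np.maximum(np.abs(a) - frac * np.max(np.abs(a)), 0.0)
    norm = np.linalg.norm(s)
    return s / norm if norm > 0 else s

def sparse_cca_rank1(X, Y, frac_x=0.5, frac_y=0.5, n_iter=100, tol=1e-8):
    """Return sparse weight vectors (u, v) for one pair of canonical components."""
    C = standardize(X).T @ standardize(Y)            # cross-product matrix
    v = np.linalg.svd(C, full_matrices=False)[2][0]  # leading right singular vector
    u = unit_sparse(C @ v, frac_x)
    for _ in range(n_iter):
        u = unit_sparse(C @ v, frac_x)               # update X-side weights
        v_new = unit_sparse(C.T @ u, frac_y)         # update Y-side weights
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v

# Synthetic demonstration (assumed toy data, not TCGA-KIRC).
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)                 # shared latent signal
cna = rng.normal(size=(n, 15))
cna[:, :3] += 2 * z[:, None]           # CNA sites 0-2 carry the signal
meth = rng.normal(size=(n, 20))
meth[:, :2] += 2 * z[:, None]          # methylation sites 0-1 carry the signal
expr = rng.normal(size=(n, 30))
expr[:, :4] += 2 * z[:, None]          # genes 0-3 carry the signal

# Step 1 (assumed): gene expression versus methylation.
u_gene, v_meth = sparse_cca_rank1(expr, meth)
# Step 2 (assumed): selected methylation sites versus CNAs.
meth_sel = np.flatnonzero(v_meth)
w_meth, w_cna = sparse_cca_rank1(meth[:, meth_sel], cna)

# A "gene component score" taken here as the weighted sum of standardized expression.
gene_score = standardize(expr) @ u_gene

print("selected genes:", np.flatnonzero(u_gene))
print("selected methylation sites:", meth_sel)
print("selected CNA sites:", np.flatnonzero(w_cna))
```

In a real analysis the sparsity levels would be tuned (for example by permutation or cross-validation), further components would be extracted after deflation, and the resulting scores would enter a regression on tumor stage with interaction terms for risk factors such as smoking; none of those steps are shown here.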

Source journal
Genetic Epidemiology (category: Medicine, Public, Environmental & Occupational Health)
CiteScore: 4.40
Self-citation rate: 9.50%
Articles published: 49
Review time: 6-12 weeks
Journal description: Genetic Epidemiology is a peer-reviewed journal for discussion of research on the genetic causes of the distribution of human traits in families and populations. Emphasis is placed on the relative contribution of genetic and environmental factors to human disease as revealed by genetic, epidemiological, and biologic investigations. Genetic Epidemiology primarily publishes papers in statistical genetics, a research field that is primarily concerned with development of statistical, bioinformatical, and computational models for analyzing genetic data. Incorporation of underlying biology and population genetics into conceptual models is favored. The Journal seeks original articles comprising either applied research or innovative statistical, mathematical, computational, or genomic methodologies that advance studies in genetic epidemiology. Other types of reports are encouraged, such as letters to the editor, topic reviews, and perspectives from other fields of research that will likely enrich the field of genetic epidemiology.