inMTSCCA: An Integrated Multi-task Sparse Canonical Correlation Analysis for Multi-omic Brain Imaging Genetics

IF 7.9 | Zone 2 (Biology) | JCR Q1, Genetics & Heredity

Genomics, Proteomics & Bioinformatics Pub Date : 2023-04-01 DOI:10.1016/j.gpb.2023.03.005

Lei Du, Jin Zhang, Ying Zhao, Muheng Shang, Lei Guo, Junwei Han, The Alzheimer's Disease Neuroimaging Initiative

Volume 21, Issue 2, Pages 396-413. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S1672022923000943

Citations: 0

Abstract



inMTSCCA: An Integrated Multi-task Sparse Canonical Correlation Analysis for Multi-omic Brain Imaging Genetics

Identifying genetic risk factors for Alzheimer’s disease (AD) is an important research topic. To date, different endophenotypes, such as imaging-derived and proteomic expression-derived endophenotypes, have shown great value in uncovering risk genes compared with case–control studies. Biologically, a co-varying pattern across different omics-derived endophenotypes could result from a shared genetic basis. However, existing methods mainly focus on the effect of endophenotypes alone; the effect of cross-endophenotype (CEP) associations remains largely unexploited. In this study, we used both the endophenotypes and their CEP associations in multi-omic data to identify genetic risk factors, and proposed two integrated multi-task sparse canonical correlation analysis (inMTSCCA) methods: pairwise endophenotype correlation-guided MTSCCA (pcMTSCCA) and high-order endophenotype correlation-guided MTSCCA (hocMTSCCA). pcMTSCCA employed pairwise correlations between magnetic resonance imaging (MRI)-derived, plasma-derived, and cerebrospinal fluid (CSF)-derived endophenotypes as an additional penalty, whereas hocMTSCCA used high-order correlations among these multi-omic data for regularization. To identify genetic risk factors at the individual and group levels, as well as altered endophenotypic markers, we introduced sparsity-inducing penalties into both models. We compared pcMTSCCA and hocMTSCCA with three related methods on both simulated and real datasets, the latter consisting of neuroimaging data, proteomic analytes, and genetic data. The results showed that our methods obtained better or comparable canonical correlation coefficients (CCCs) and better feature subsets than the benchmarks. Most importantly, the identified genetic loci and heterogeneous endophenotypic markers showed high relevance. Therefore, jointly using multi-omic endophenotypes and their CEP associations is a promising way to reveal genetic risk factors. The source code and manual of inMTSCCA are available at https://ngdc.cncb.ac.cn/biocode/tools/BT007330.
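For readers unfamiliar with the building block the abstract refers to, the following is a minimal sketch of a generic single-task sparse CCA and of how a canonical correlation coefficient (CCC) is computed. It is not the inMTSCCA, pcMTSCCA, or hocMTSCCA algorithm: it uses a simple alternating soft-thresholding scheme on one pair of data blocks and omits the multi-task structure, the group-level penalties, and the CEP-guided regularization. The function names (`sparse_cca`, `soft_threshold`, `unit`), the penalty values, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Element-wise soft-thresholding, the proximal step for an L1 penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def unit(a):
    """Scale a vector to unit L2 norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a

def sparse_cca(X, Y, lam_u=0.3, lam_v=0.3, n_iter=50):
    """Generic sparse CCA sketch (hypothetical helper, not the paper's method).

    Alternately updates sparse weight vectors u and v so that the projections
    X @ u and Y @ v are highly correlated. Returns (u, v, ccc).
    """
    # Standardize columns so the cross-product matrix holds sample correlations.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]
    # Start v at the leading right singular vector of the cross-correlation.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = unit(soft_threshold(C @ v, lam_u))
        v = unit(soft_threshold(C.T @ u, lam_v))
    # The CCC is the Pearson correlation between the two projections.
    ccc = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, ccc

# Synthetic illustration (assumed data): 3 "genetic" and 2 "endophenotype"
# features share one latent signal; all other features are pure noise.
rng = np.random.default_rng(0)
n, p, q = 200, 30, 20
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, :3] = z[:, None] + 0.5 * rng.standard_normal((n, 3))
Y[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))

u, v, ccc = sparse_cca(X, Y)
print("CCC:", round(float(ccc), 3))
print("nonzero u indices:", np.flatnonzero(u).tolist())
print("nonzero v indices:", np.flatnonzero(v).tolist())
```

In this toy setting the L1 thresholds play the role of the sparsity-inducing penalties mentioned in the abstract: they zero out weights on noise features so that the selected features can be read off from the nonzero entries of `u` and `v`.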


Source journal

Genomics, Proteomics & Bioinformatics (Biochemistry, Genetics and Molecular Biology: Biochemistry)

CiteScore: 14.30

Self-citation rate: 4.20%

Articles published: 844

Review time: 61 days

About the journal: Genomics, Proteomics and Bioinformatics (GPB) is the official journal of the Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation and Genetics Society of China. It aims to disseminate new developments in the field of omics and bioinformatics, publish high-quality discoveries quickly, and promote open access and online publication. GPB welcomes submissions in all areas of life science, biology, and biomedicine, with a focus on large data acquisition, analysis, and curation. Manuscripts covering omics and related bioinformatics topics are particularly encouraged. GPB is indexed/abstracted by PubMed/MEDLINE, PubMed Central, Scopus, BIOSIS Previews, Chemical Abstracts, and CSCD, among others.