Transfer learning reveals the mediating mechanisms of cross-ethnic lipid metabolic pathways in the association between APOE gene and Alzheimer's disease.
Authors: Lulu Pan, Yahang Liu, Chen Huang, Ruilang Lin, Yongfu Yu, Guoyou Qin
Journal: Briefings in Bioinformatics, volume 26, issue 5 (impact factor 7.7; JCR Q1, Biochemical Research Methods)
Published: 2025-09-06
DOI: 10.1093/bib/bbaf460 (https://doi.org/10.1093/bib/bbaf460)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445873/pdf/
Abstract
Lipid-mediated effects are central to explaining the pathological mechanisms that link the ε4 allele of the apolipoprotein E gene (APOE ε4) to Alzheimer's disease (AD). However, traditional mediation analysis methods often lack statistical power in studies of minority populations because sample sizes are limited. This study develops a high-dimensional mediation analysis model (TransHDM) built on a transfer learning framework. By borrowing information from large-sample source data, it improves the ability to identify potential mediators in small-sample target data. The method first fits a high-dimensional regression model to the pooled source and target data, then applies transfer regularization to adjust for heterogeneity between the source and target domains, correcting the estimation bias of the high-dimensional Lasso. The result is parameter transfer across domains that addresses the statistical bias and inferential uncertainty caused by small sample sizes. In simulations, the approach shows significantly higher power than traditional methods for identifying true mediators while controlling the family-wise error rate under multiple testing. Applied to the Alzheimer's Disease Neuroimaging Initiative cohort, TransHDM transferred information from the larger samples of white and other ethnic groups and identified additional lipid metabolic pathways, beyond those found by the pre-transfer analysis, that mediate the influence of the APOE ε4 allele on AD pathological progression in African American populations. These pathways include glycerophospholipid metabolism, glycerolipid metabolism, sphingolipid metabolism, and ether lipid metabolism (false discovery rate < 0.05). The TransHDM framework thus provides a methodological tool for research on small-sample populations and offers insights for future work on disease mechanisms and on biomarkers for disease prediction.
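The "pooled fit, then correct for heterogeneity" idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' TransHDM implementation. It assumes a generic two-step transfer Lasso (a Lasso fit on the pooled source and target data, followed by a sparse Lasso correction estimated from the target residuals), and it omits the debiasing and mediator-testing steps. The function names (`lasso_cd`, `transfer_lasso`), the tuning parameters (`lam_pool`, `lam_delta`), and the simulated data are all hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent.

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()  # equals y - X @ beta at beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def transfer_lasso(X_src, y_src, X_tgt, y_tgt, lam_pool, lam_delta):
    """Two-step transfer estimate (illustrative assumption, not TransHDM itself)."""
    # Step 1: fit one Lasso on the pooled source + target samples.
    X_pool = np.vstack([X_src, X_tgt])
    y_pool = np.concatenate([y_src, y_tgt])
    w = lasso_cd(X_pool, y_pool, lam_pool)
    # Step 2: estimate a sparse source-to-target shift from the target residuals.
    delta = lasso_cd(X_tgt, y_tgt - X_tgt @ w, lam_delta)
    return w + delta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 30
    beta_src = np.zeros(p)
    beta_src[:4] = 1.0
    beta_tgt = beta_src.copy()
    beta_tgt[0] += 0.3  # small, sparse difference between the two domains

    X_src = rng.standard_normal((1000, p))  # large source sample
    X_tgt = rng.standard_normal((40, p))    # small target sample
    y_src = X_src @ beta_src + 0.5 * rng.standard_normal(1000)
    y_tgt = X_tgt @ beta_tgt + 0.5 * rng.standard_normal(40)

    target_only = lasso_cd(X_tgt, y_tgt, lam=0.1)
    transferred = transfer_lasso(X_src, y_src, X_tgt, y_tgt,
                                 lam_pool=0.05, lam_delta=0.1)
    print("target-only error:", np.linalg.norm(target_only - beta_tgt))
    print("transfer error:   ", np.linalg.norm(transferred - beta_tgt))
```

In this sketch, `lam_delta` controls how far the target estimate may move away from the pooled fit: a very large value disables the correction and returns the pooled estimate unchanged, while a small value lets the limited target data dominate. In practice both penalties would be chosen by cross-validation on the target sample.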
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.