{"title":"Dimension Reduction Using Local Principal Components for Regression-Based Multi-SNP Analysis in 1000 Genomes and the Canadian Longitudinal Study on Aging (CLSA)","authors":"Fatemeh Yavartanoo, Myriam Brossard, Shelley B. Bull, Andrew D. Paterson, Yun Joo Yoo","doi":"10.1002/gepi.70005","DOIUrl":"https://doi.org/10.1002/gepi.70005","url":null,"abstract":"<div><p>For genetic association analysis based on multiple SNP regression of genotypes obtained by dense DNA sequencing or array data imputation, multi-collinearity can be a severe issue causing failure to fit the regression model. In this study, we propose a method of Dimension Reduction using Local Principal Components (DRLPC), which aims to resolve multi-collinearity by removing SNPs under the assumption that the remaining SNPs can capture the effect of a removed SNP due to high linear dependency. This approach to dimension reduction is expected to improve the power of regression-based statistical tests. We apply DRLPC to chromosome 22 SNPs of two data sets, the 1000 Genomes Project (phase 3) and the Canadian Longitudinal Study on Aging (CLSA), and calculate variance inflation factors (VIF) in various SNP-sets before and after implementing DRLPC as a metric of collinearity. Notably, DRLPC addresses multi-collinearity by excluding variables with a VIF exceeding a predetermined threshold (VIF = 20), thereby improving applicability for subsequent regression analyses. The number of variables in a final set for regression analysis is reduced to around 20% on average for larger-sized genes, whereas for smaller ones, the proportion is around 48%, suggesting that DRLPC is particularly effective for larger genes. We also compare the power of several multi-SNP statistics constructed for gene-specific analysis to evaluate power gains achieved by DRLPC. In simulation studies based on 100 genes with ≤ 500 SNPs per gene, DRLPC increases the power of the multiple regression Wald test from 60% to around 80%.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143521945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sex-Specific Association Between Polymorphisms in Estrogen Receptor Alpha Gene (ESR1) and Depression: A Genome-Wide Association Study of All of Us and UK Biobank Data","authors":"Yue Hu, Menglu Che, Heping Zhang","doi":"10.1002/gepi.70004","DOIUrl":"https://doi.org/10.1002/gepi.70004","url":null,"abstract":"<div><p>Major depressive disorder (MDD) is prevalent worldwide, substantially and negatively impacting both the quality and length of life of 280 million people globally. The genetic risk factors of MDD have been examined in many previous studies, but the findings lack consistency. Sex/gender and racial/ethnic disparities have been reported; however, many previous genetic studies, represented by large-scale genome-wide association studies (GWASs), are known to lack diversity in the study cohorts. All of Us is a biorepository aiming to focus on historically underrepresented groups. We perform GWASs for the MDD phenotype, using over 200,000 participants' genotypes, and carry out sex- and racial/ethnic-specific subgroup studies. We identified a risk locus (chr6:151945242) in Estrogen Receptor Alpha Gene (<i>ESR1</i>) (<i>p</i> = 1.70 × 10<sup>−9</sup>), and further confirmed that the genetic association is sex-specific. The single-nucleotide polymorphism (SNP) chr6:151945242 was significant in the male group but not in the female group. These findings were replicated in the UK Biobank and echo existing studies on the <i>ESR1</i> gene and depressive disorders. Our results indicate that the All of Us program is a reliable resource for GWAS, and shed light on further investigation of sex- and racial/ethnic-specific genome association, especially in underrepresented groups of the US population.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reference-Based Standardization Approach Stabilizing Small Batch Risk Prediction via Polygenic Score","authors":"Yoichi Sutoh, Tsuyoshi Hachiya, Yayoi Otsuka-Yamasaki, Tomoharu Tokutomi, Akiko Yoshida, Yuka Kotozaki, Shohei Komaki, Shiori Minabe, Hideki Ohmomo, Kozo Tanno, Akimune Fukushima, Makoto Sasaki, Atsushi Shimizu","doi":"10.1002/gepi.70002","DOIUrl":"10.1002/gepi.70002","url":null,"abstract":"<div><p>The polygenic score (PGS) holds promise for motivating preventive behavioral changes. However, no clinically validated standardization methodology currently exists. Here, we demonstrate the efficacy of a “reference-based” approach for standardization. This method uses the PGS distribution in the general population as a reference for normalization and percentile determination; however, it has not been validated. We investigated three potential influences on PGS computation: (1) the size of the reference population, (2) biases associated with different genotyping platforms, and (3) inclusion of kinship ties within the reference group. Our results indicate that the reference size affects the bootstrap estimate of standard error for PGS percentiles, peaking around the 50th percentile and diminishing at extreme percentiles (1st or 100th). Discrepancies between genotyping platforms, such as different microarrays and whole-genome sequencing, resulted in deviations in PGS (<i>p</i> < 0.05 in Kolmogorov–Smirnov test). However, these deviations were reduced to a nonsignificant level using shared genetic variants in the calculations when the ancestry of the samples and reference were matched. This approach recovered approximately 9.6% of the positive predictive value of PGS by naïve genotype. Our results provide fundamental insights for establishing clinical guidelines for implementing PGS to communicate reliable risks to individuals.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143065247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gene−Air Pollution Interaction and Diversity of Genetic Sampling: The Southern California Children's Health Study","authors":"Justine Po, John Morrison, Brittney Marian, Zhanghua Chen, W. James Gauderman, Erika Garcia","doi":"10.1002/gepi.70000","DOIUrl":"10.1002/gepi.70000","url":null,"abstract":"<div><p>Gene−environment interactions have been observed for childhood asthma; however, few have been assessed in ethnically diverse populations. Thus, we examined how polygenic risk score (PRS) modifies the association between ambient air pollution exposure (nitrogen dioxide [NO<sub>2</sub>], ozone, particulate matter < 2.5 and < 10 μm) and childhood asthma incidence in a diverse cohort. Participants (<i>n</i> = 1794) were drawn from the Southern California Children's Health Study, a multi-wave prospective cohort followed from 4th to 12th grade. PRS was developed using single nucleotide polymorphisms previously associated with childhood asthma. PRS−asthma associations and PRS−air pollutant interactions were estimated using Poisson regression. An interquartile range PRS increase was associated with 36% (95% CI: 9%, 70%) higher asthma incidence among non-Hispanic children, but not associated with asthma among Hispanic children (rate ratio: 0.81 [95% CI: 0.62, 1.04]). NO<sub>2</sub>−PRS interaction was borderline significant in the overall sample (coefficient: 0.23 [95% CI: −0.03, 0.49]). Limited evidence was observed for a positive interaction between PRS and NO<sub>2</sub> exposure associated with asthma incidence; however, the literature-based PRS was not associated with asthma among Hispanic participants. Equitable, diverse genetic sampling approaches are needed to better identify clinically relevant SNPs in this population.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143046386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Statistical Method for Unmasking Sex-Specific Genomics Signatures in Complex Traits","authors":"Samaneh Mansouri, Mélissa Rochette, Benoit Labonté, Qingrun Zhang, Ting-Huei Chen","doi":"10.1002/gepi.22612","DOIUrl":"10.1002/gepi.22612","url":null,"abstract":"<div><p>Genotype–phenotype association studies have advanced our understanding of complex traits but often overlook sex-specific genetic signals. The growing awareness of sex-specific influences on human traits and diseases necessitates tailored statistical methodologies to dissect these genetic intricacies. Rare genetic variants play a significant role in disease development, often exhibiting stronger per-allele effects than common variants. In sex-dimorphic analysis, traits are viewed as having two sex-specific subsets rather than being uniformly defined. Existing methods for gene-based analysis of rare variants across multiple traits can identify shared genetic signals but cannot reveal the specific subsets from which significant signals originate. This means that when a significant signal is detected, it remains unclear whether it arises from the male samples, female samples, or both. To address this limitation, we propose <i>SubsetRV</i>, a new methodology capable of identifying genes associated with specific traits or diseases in males, females, or both. <i>SubsetRV</i> can also be applied to broader applications in multiple traits analysis. Simulation studies have demonstrated <i>SubsetRV</i>'s reliability, and real data analysis on bipolar disorder and schizophrenia has revealed potential sex-specific genetic signals. <i>SubsetRV</i> offers a valuable tool for identifying sex-specific genetic candidates, aiding in understanding disease mechanisms. An R package for <i>SubsetRV</i> is available on GitHub. It can be accessed directly through this link: https://github.com/Mansouri-S/SubsetRV.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143004344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Disease Associated Multi-Omics Network With Mixed Graphical Models Based on Markov Random Field Model","authors":"Jaehyun Park, Sungho Won","doi":"10.1002/gepi.22605","DOIUrl":"10.1002/gepi.22605","url":null,"abstract":"<div><p>In this article, we proposed a new method named fused mixed graphical model (FMGM), which can infer network structures associated with dichotomous phenotypes. FMGM is based on a pairwise Markov random field model, and statistical analyses including the proposed method were conducted to find biological markers and underlying network structures of atopic dermatitis (AD) from multiomics data of 6-month-old infants. The performance of FMGM was evaluated with simulations by using synthetic datasets of power-law networks, showing that FMGM had superior performance for identifying the differences of the networks compared to the separate inference with the previous method, causalMGM (F1-scores 0.550 vs. 0.730). Furthermore, FMGM was applied to identify multiomics profiles associated with AD, and a significant association was found for the correlation between carotenoid biosynthesis and RNA degradation, suggesting the importance of metabolism related to oxidative stress and microbial RNA balance. The R code is available as the R package “fusedMGM,” and an example data set and a script for analyses can be found at http://figshare.com/articles/dataset/FMGM_synthetic_data_example_zip/20509113.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genetically Predicted Gene Expression Effects on Changes in Red Blood Cell and Plasma Polyunsaturated Fatty Acids","authors":"Nikhil K. Khankari, Timothy Su, Qiuyin Cai, Lili Liu, Elizabeth A. Jasper, Jacklyn N. Hellwege, Harvey J. Murff, Martha J. Shrubsole, Jirong Long, Todd L. Edwards, Wei Zheng","doi":"10.1002/gepi.22613","DOIUrl":"10.1002/gepi.22613","url":null,"abstract":"<p>Polyunsaturated fatty acids (PUFAs) including omega-3 and omega-6 are obtained from diet and can be measured objectively in plasma or red blood cell (RBC) membrane biomarkers, representing different dietary exposure windows. In vivo conversion of omega-3 and omega-6 PUFAs from short- to long-chain counterparts occurs via a shared metabolic pathway involving fatty acid desaturases and elongase. This analysis leveraged genome-wide association study (GWAS) summary statistics for RBC and plasma PUFAs, along with expression quantitative trait loci (eQTL) to estimate tissue-specific genetically predicted gene expression effects for delta-5 desaturase (<i>FADS1</i>), delta-6 desaturase (<i>FADS2</i>), and elongase (<i>ELOVL2</i>) on changes in RBC and plasma biomarkers. Using colocalization, we identified shared variants associated with both increased gene expression and changes in RBC PUFA levels in relevant PUFA metabolism tissues (i.e., adipose, liver, muscle, and whole blood). We observed differences in RBC versus plasma PUFA levels for genetically predicted increase in <i>FADS1</i> and <i>FADS2</i> gene expression, primarily for omega-6 PUFAs linoleic acid (LA) and arachidonic acid (AA). The colocalization analysis identified rs102275 to be significantly associated with a 0.69% increase in total RBC membrane-bound LA levels (<i>p</i> = 5.4 × 10<sup>−12</sup>). Future PUFA genetic studies examining long-term PUFA biomarkers are needed to confirm our results.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734643/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multivariable MR Can Mitigate Bias in Two-Sample MR Using Covariable-Adjusted Summary Associations","authors":"Joe Gilbody, Maria Carolina Borges, George Davey Smith, Eleanor Sanderson","doi":"10.1002/gepi.22606","DOIUrl":"10.1002/gepi.22606","url":null,"abstract":"<p>Genome-wide association studies (GWAS) are hypothesis-free studies that estimate the association between polymorphisms across the genome with a trait of interest. To increase power and to estimate the direct effects of these single-nucleotide polymorphisms (SNPs) on a trait, GWAS are often conditioned on a covariate (such as body mass index or smoking status). This adjustment can introduce bias in the estimated effect of the SNP on the trait. Two-sample Mendelian randomisation (MR) studies use summary statistics from GWAS to estimate the causal effect of a risk factor (or exposure) on an outcome. Covariate adjustment in GWAS can bias the effect estimates obtained from MR studies conducted using covariate-adjusted GWAS data. Multivariable MR (MVMR) is an extension of MR that includes multiple traits as exposures. Here we propose the use of MVMR to correct the bias in MR studies from covariate adjustment. We show how MVMR can recover unbiased estimates of the direct effect of the exposure of interest by including the covariate used to adjust the GWAS within the analysis. We apply this method to estimate the effect of systolic blood pressure on type-2 diabetes and the effect of waist circumference on systolic blood pressure. Our analytical and simulation results show that MVMR provides unbiased effect estimates for the exposure when either the exposure or outcome of interest has been adjusted for a covariate. Our results also highlight the parameters that determine when MR will be biased by GWAS covariate adjustment. The results from the applied analysis mirror these results, with equivalent results seen in the MVMR with and without adjusted GWAS. When GWAS results have been adjusted for a covariate, biasing MR effect estimates, direct effect estimates of an exposure on an outcome can be obtained by including that covariate as an additional exposure in an MVMR estimation. However, the estimated effect of the covariate obtained from the MVMR estimation is biased.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734645/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transferability of Single- and Cross-Tissue Transcriptome Imputation Models Across Ancestry Groups","authors":"Inti Pagnuco, Stephen Eyre, Magnus Rattray, Andrew P. Morris","doi":"10.1002/gepi.22611","DOIUrl":"10.1002/gepi.22611","url":null,"abstract":"<p>Transcriptome-wide association studies (TWAS) investigate the links between genetically regulated gene expression and complex traits. TWAS involves imputing gene expression using expression quantitative trait loci (eQTL) as predictors and testing the association between the imputed expression and the trait. The effectiveness of TWAS depends on the accuracy of these imputation models, which require genotype and gene expression data from the same samples. However, publicly accessible resources, such as the Genotype Tissue Expression (GTEx) Project, are biased toward individuals of European ancestry, potentially reducing prediction accuracy in other ancestry groups. This study explored eQTL transferability across ancestry groups by comparing two imputation models: PrediXcan (tissue-specific) and UTMOST (cross-tissue). Both models were trained on tissues from the GTEx Project using European ancestry individuals and then tested on data sets of European ancestry and African American individuals. Results showed that both models performed best when the training and testing data sets were from the same ancestry group, with the cross-tissue approach generally outperforming the tissue-specific approach. This study underscores that eQTL detection is influenced by ancestry and tissue context. Developing ancestry-specific reference panels across tissues can improve prediction accuracy, enhancing TWAS analysis and our understanding of the biological processes contributing to complex traits.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734644/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"General Kernel Machine Methods for Multi-Omics Integration and Genome-Wide Association Testing With Related Individuals","authors":"Amarise Little, Ni Zhao, Anna Mikhaylova, Angela Zhang, Wodan Ling, Florian Thibord, Andrew D. Johnson, Laura M. Raffield, Joanne E. Curran, John Blangero, Jeffrey R. O'Connell, Huichun Xu, Jerome I. Rotter, Stephen S. Rich, Kenneth M. Rice, Ming-Huei Chen, Alexander Reiner, Charles Kooperberg, Thao Vu, Lifang Hou, Myriam Fornage, Ruth J.F. Loos, Eimear Kenny, Rasika Mathias, Lewis Becker, Albert V. Smith, Eric Boerwinkle, Bing Yu, Timothy Thornton, Michael C. Wu","doi":"10.1002/gepi.22610","DOIUrl":"10.1002/gepi.22610","url":null,"abstract":"<div><p>Integrating multi-omics data may help researchers understand the genetic underpinnings of complex traits and diseases. However, the best ways to integrate multi-omics data and use them to address pressing scientific questions remain a challenge. One important and topical problem is how to assess the aggregate effect of multiple genomic data types (e.g., genotypes and gene expression levels) on a phenotype, particularly while accommodating routine issues, such as having related subjects' data in analyses. In this paper, we extend an existing composite kernel machine regression model to integrate two multi-omics data types, while accommodating general correlation structures amongst outcomes. Due to the kernel machine regression framework, our methods allow for the integration of high-dimensional omics data with small, nonlinear, and interactive effects, and accommodation of general study designs. Here, we focus on scientific questions that aim to assess the association between a functional grouping (such as a gene or a pathway) and a quantitative trait of interest. We use a kernel machine regression to integrate the two multi-omics data types, as they may relate to the trait, and perform a global test of association. We demonstrate the advantage of this approach over single data type association tests via simulation. Finally, we apply this method to a large, multi-ethnic data set to investigate how predicted gene expression and rare genetic variation may be related to two platelet traits.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"49 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142983296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}