Comparing Ancestry Standardization Approaches for a Transancestry Colorectal Cancer Polygenic Risk Score

Impact factor 1.7 · Zone 4 (Medicine) · JCR Q3, GENETICS & HEREDITY
Elisabeth A. Rosenthal, Li Hsu, Minta Thomas, Ulrike Peters, Christopher Kachulis, Karynne Patterson, Gail P. Jarvik
Journal: Genetic Epidemiology, volume 49, issue 1
Published: 2024-09-24
DOI: 10.1002/gepi.22590
URL: https://onlinelibrary.wiley.com/doi/10.1002/gepi.22590

Abstract

Colorectal cancer (CRC) is a complex disease with monogenic, polygenic, and environmental risk factors. Polygenic risk scores (PRSs) aim to identify individuals at high polygenic risk. Because genetic background differs across populations, PRS distributions vary by ancestry, necessitating standardization. We compared four post-hoc standardization methods for a transancestry CRC PRS using the All of Us Research Program Whole Genome Sequence data. The methods were linear models that differed in (A) the training data, either the entire data set or an ancestrally diverse subset, and (B) the covariates, either principal components of ancestry or admixture. Standardization with the training subset also adjusted the variance. All methods performed similarly within ancestry, with the following odds ratios (95% C.I.) per s.d. change in PRS: African 1.5 (1.02, 2.08), Admixed American 2.2 (1.27, 3.85), European 1.6 (1.43, 1.89), and Middle Eastern 1.1 (0.71, 1.63). Using admixture and an ancestrally diverse training set provided distributions closest to standard Normal. Training a model on ancestrally diverse participants, adjusting both the mean and variance using admixture as covariates, created standard Normal z-scores, which can be used to identify patients at high polygenic risk. These scores can be incorporated into a comprehensive risk calculation that includes other known risk factors, allowing for more precise risk estimates.
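
The abstract does not give the model equations, so the following is only a minimal sketch of one common way to adjust both the mean and the variance of a PRS with ancestry covariates; it is not presented as the authors' implementation. It assumes ordinary least squares for the mean model and a second least-squares fit of the squared residuals for the variance model. The function names (`fit_standardizer`, `standardize`), the simulated admixture proportions, the per-group means and standard deviations, and the 1.645 cutoff are all hypothetical and chosen for illustration.

```python
import numpy as np


def fit_standardizer(prs, covariates):
    """Fit mean and variance models of the raw PRS on ancestry covariates.

    prs: (n,) raw scores. covariates: (n, k) ancestry PCs or admixture
    proportions (hypothetical inputs, not the study data).
    """
    X = np.column_stack([np.ones(len(prs)), covariates])
    # Mean model: least-squares regression of the PRS on the covariates.
    beta_mean, *_ = np.linalg.lstsq(X, prs, rcond=None)
    resid = prs - X @ beta_mean
    # Variance model (assumption): regress squared residuals on the same covariates.
    beta_var, *_ = np.linalg.lstsq(X, resid**2, rcond=None)
    return beta_mean, beta_var


def standardize(prs, covariates, beta_mean, beta_var, min_var=1e-8):
    """Return z-scores: (PRS - predicted mean) / sqrt(predicted variance)."""
    X = np.column_stack([np.ones(len(prs)), covariates])
    mean = X @ beta_mean
    # Guard against non-positive predicted variances.
    var = np.maximum(X @ beta_var, min_var)
    return (prs - mean) / np.sqrt(var)


# Simulated example: admixture proportions over three hypothetical reference groups.
rng = np.random.default_rng(0)
n = 20000
admix = rng.dirichlet([0.3, 0.3, 0.3], size=n)  # each row sums to 1
group_mean = np.array([0.0, 1.5, -1.0])         # made-up group-specific PRS means
group_sd = np.array([1.0, 1.5, 0.7])            # made-up group-specific PRS s.d.
prs = rng.normal(admix @ group_mean, np.sqrt(admix @ group_sd**2))

# Drop one proportion so the covariates are not collinear with the intercept.
cov = admix[:, :2]
beta_mean, beta_var = fit_standardizer(prs, cov)
z = standardize(prs, cov, beta_mean, beta_var)

# Flag individuals above a hypothetical high-risk cutoff (roughly the top 5%
# if z is close to standard Normal).
high_risk = z > 1.645
```

In this sketch the model is fitted and applied on the same simulated sample. Following the abstract, the fit could instead be done on an ancestrally diverse training subset and the resulting coefficients then applied to everyone via `standardize`.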

Source journal
Genetic Epidemiology (Medicine: Public, Environmental & Occupational Health)
CiteScore: 4.40
Self-citation rate: 9.50%
Articles published: 49
Review time: 6-12 weeks
Journal description: Genetic Epidemiology is a peer-reviewed journal for discussion of research on the genetic causes of the distribution of human traits in families and populations. Emphasis is placed on the relative contribution of genetic and environmental factors to human disease as revealed by genetic, epidemiological, and biologic investigations. Genetic Epidemiology primarily publishes papers in statistical genetics, a research field that is primarily concerned with development of statistical, bioinformatical, and computational models for analyzing genetic data. Incorporation of underlying biology and population genetics into conceptual models is favored. The Journal seeks original articles comprising either applied research or innovative statistical, mathematical, computational, or genomic methodologies that advance studies in genetic epidemiology. Other types of reports are encouraged, such as letters to the editor, topic reviews, and perspectives from other fields of research that will likely enrich the field of genetic epidemiology.